A comparison between machine and deep learning models on high stationarity data / Santoro, D.; Ciano, T.; Ferrara, M.. - In: SCIENTIFIC REPORTS. - ISSN 2045-2322. - 14:1(2024), pp. 1-11. [10.1038/s41598-024-70341-6]
A comparison between machine and deep learning models on high stationarity data
Ferrara M. (Conceptualization)
2024-01-01
Abstract
Advances in sensor, computing, and communication technologies are enabling big data analytics by providing time series data. However, conventional models struggle to identify sequence features and to produce accurate forecasts. This paper investigates time series features and shows that some machine learning algorithms can outperform deep learning models. Specifically, the problem analyzed is the prediction of the number of vehicles passing through an Italian tollbooth in 2021. The dataset, composed of 8766 rows and 6 columns relating to additional tollbooths, proved to be highly stationary and was modeled with machine learning methods such as support vector machine, random forest, and eXtreme gradient boosting (XGBoost), as well as with deep learning in the form of recurrent neural networks with long short-term memory (RNN-LSTM) cells. In the comparison of these models, XGBoost outperforms the competing algorithms, particularly in terms of MAE and MSE. The result highlights how an algorithm shallower than a neural network can, in this case, fit the time series better than a much deeper model, which tends to produce a smoother prediction.

File | Size | Format | |
---|---|---|---|
Ferrara_2024_ SR_Stationarity data_editor.pdf (open access; type: publisher's version (PDF); license: Creative Commons) | 3.37 MB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.