tot-alin/Regression_Autoencoder_LSMT

Regressive Encoder-Decoder LSTM

The autoregressive model presented below forecasts exchange rates for the next 7 days from a 14-day history, for 16 currencies: ['AUD', 'CAD', 'CHF', 'CZK', 'DKK', 'EGP', 'EUR', 'GBP', 'HUF', 'JPY', 'MDL', 'NOK', 'PLN', 'SEK', 'TRY', 'USD']

The dataset was taken from https://www.bnro.ro/Raport-statistic-606.aspx (04.07.2005 – 25.03.2025) and adapted for the model as a multivariate time series of shape 4981 x 16 (4981 observations, one column per currency). The architecture of this model is an autoencoder: the encoder block has five Long Short-Term Memory (LSTM) recurrent layers that compress the input and pass the information to the decoder, which also consists of five LSTM layers.

The model follows a recursive strategy, so the dataset was built with a window step (stride) equal to the number of prediction days, i.e., 7.

Time series database (TSDB)

A time series database is a database designed to store, manage, and analyze information collected in chronological order. Each element is associated with a time value, which lets the database handle the special characteristics of time series data. TSDBs are built for workloads such as real-time monitoring, historical analysis, and predictive forecasting, in applications such as IoT, financial trading, and IT infrastructure monitoring.

Multivariate Time Series (MTS)

A multivariate time series (MTS) is a dataset with two or more variables recorded over the same time interval. This approach captures the interactions and dependencies between multiple time series, for example, the relationship between temperature, humidity, and energy consumption recorded simultaneously. MTS analysis gives a more comprehensive view than analyzing each variable in isolation (univariate time series).

Recursive strategy

The recursive strategy uses a single model that forecasts one step ahead. At each forecasting stage, the model predicts the next step, the prediction is appended to the chronologically ordered input features, and the model is applied again (similar to a recursive function), and so on until the desired forecast horizon is reached.
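The loop described above can be sketched in plain Python. This is a minimal illustration, not the project's code: `recursive_forecast` and `persistence_model` are hypothetical names, and the stand-in model simply repeats the last observed row 7 times in place of a trained network.

```python
from typing import Callable, List

Row = List[float]      # one time step: one value per series
Window = List[Row]     # consecutive time steps


def recursive_forecast(
    model: Callable[[Window], Window],
    history: Window,
    n_in: int,
    horizon: int,
) -> Window:
    """Forecast `horizon` steps by feeding predictions back as inputs."""
    if len(history) < n_in:
        raise ValueError("history is shorter than the input window")
    context = [list(row) for row in history[-n_in:]]
    forecast: Window = []
    while len(forecast) < horizon:
        block = model(context)  # one or more predicted steps
        if not block:
            raise ValueError("model returned no predictions")
        forecast.extend(block)
        # Slide the window: drop the oldest steps, append the new predictions.
        context = (context + block)[-n_in:]
    return forecast[:horizon]


def persistence_model(window: Window) -> Window:
    """Hypothetical stand-in for a trained network: repeat the last row 7 times."""
    return [list(window[-1]) for _ in range(7)]


# Toy history with 14 steps and 2 series (assumed values, for illustration only).
history = [[float(t), float(t) * 2.0] for t in range(14)]
prediction = recursive_forecast(persistence_model, history, n_in=14, horizon=21)
print(len(prediction), prediction[0])
```

Any callable that maps a 14-step window to a block of predicted steps could replace `persistence_model` here; the loop itself does not depend on how the block is produced.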

For the model in question, a dataset was created with the form illustrated in the diagram below. A multivariate time series was used, and each sample was split into 7 label time steps, i.e., a 7 x 16 matrix, and 14 input time steps, i.e., a 14 x 16 matrix.

grafic_dataset
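A minimal sketch of how such samples could be cut from the series, assuming a sliding window of 14 input steps followed by 7 label steps and a stride of 7. The function name `make_windows` and the synthetic series are illustrative assumptions, not taken from the repository.

```python
from typing import List, Tuple

Row = List[float]


def make_windows(
    series: List[Row], n_in: int = 14, n_out: int = 7, stride: int = 7
) -> Tuple[List[List[Row]], List[List[Row]]]:
    """Cut a (time x features) series into input windows X and label windows Y."""
    X: List[List[Row]] = []
    Y: List[List[Row]] = []
    last_start = len(series) - (n_in + n_out)
    for start in range(0, last_start + 1, stride):
        X.append(series[start:start + n_in])                   # 14 x 16 input
        Y.append(series[start + n_in:start + n_in + n_out])    # 7 x 16 labels
    return X, Y


# Synthetic stand-in with the same shape as the dataset (4981 x 16).
series = [[float(t)] * 16 for t in range(4981)]
X, Y = make_windows(series)
print(len(X), len(X[0]), len(X[0][0]))
print(len(Y), len(Y[0]), len(Y[0][0]))
```

Under these assumed settings, 4981 rows yield 709 samples. That count is consistent with the fold sizes reported in the cross-validation section: 691 training and 15 validation samples in the last fold, plus the 3 most recent samples kept for testing.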

Time series cross-validation (TSCV)

Time series cross-validation is a method for training and/or validating time series forecasting models. The principle is to repeatedly divide the chronological data into training and validation sets, always keeping the validation set after the training set. As shown in the diagram below, the dataset is divided into several folds of increasing size. The case study has 10 folds whose sizes increase from the first fold, with training set X(556 x 14 x 16), Y(556 x 7 x 16) and validation set X(15 x 14 x 16), Y(15 x 7 x 16), to the last fold, with training set X(691 x 14 x 16), Y(691 x 7 x 16) and validation set X(15 x 14 x 16), Y(15 x 7 x 16).

grafic_tscv
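The fold sizes quoted above follow from an expanding-window split in which each fold's training set grows by the size of the previous validation set. The sketch below reproduces those sizes. `expanding_window_folds` is a hypothetical helper, and the total of 706 samples (the last fold's 691 training plus 15 validation samples) is derived from the numbers in the text rather than stated by the author.

```python
from typing import List, Tuple


def expanding_window_folds(
    n_samples: int, n_folds: int = 10, val_size: int = 15
) -> List[Tuple[range, range]]:
    """Return (train_indices, val_indices) per fold; validation always follows training."""
    first_train = n_samples - n_folds * val_size
    if first_train <= 0:
        raise ValueError("not enough samples for the requested folds")
    folds: List[Tuple[range, range]] = []
    for k in range(n_folds):
        train_end = first_train + k * val_size
        folds.append((range(0, train_end), range(train_end, train_end + val_size)))
    return folds


folds = expanding_window_folds(n_samples=706)
for k, (train_idx, val_idx) in enumerate(folds, start=1):
    print(k, len(train_idx), len(val_idx))
```

Because the validation indices always start where the training indices end, no fold is validated on data that precedes its training data.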

Regressive Encoder-Decoder Model

An autoregressive model predicts the next component in a sequence from the previous values in that sequence. Autoregression is a statistical technique used in time series analysis that assumes the current value of a time series is a function of its previous values.
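As a small illustration of "the current value is a function of its previous values", the classical linear AR(p) form predicts the next value as an intercept plus a weighted sum of the last p values. This is the textbook linear case, shown only to make the idea concrete; the LSTM model in this project learns a nonlinear function instead. The name `ar_predict` and the coefficients are assumptions.

```python
from typing import Sequence


def ar_predict(history: Sequence[float], coeffs: Sequence[float], intercept: float = 0.0) -> float:
    """One-step AR(p) prediction; coeffs[0] weights the most recent value."""
    p = len(coeffs)
    if len(history) < p:
        raise ValueError("history must contain at least p values")
    recent = list(reversed(history))[:p]  # most recent value first
    return intercept + sum(c * x for c, x in zip(coeffs, recent))


# AR(2) with assumed coefficients: 0.5 * 3.0 + 0.5 * 2.0
print(ar_predict([1.0, 2.0, 3.0], coeffs=[0.5, 0.5]))
```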

Autoregressive Encoder-Decoder is a hybrid approach that integrates the analytical capabilities of autoregressive models into both the encoding and decoding processes.

The model addressed in this project is illustrated in the graph below and consists of the three elements of an autoencoder, namely: encoder, compressed data, and decoder. The prediction of this model is a multivariate time series consisting of 7 time steps with 16 predictions per time step, and the input data is also a multivariate time series consisting of 14 time steps with 16 features per time step.

The architecture of the model consists of:

  • The encoder consists of the input layer and five successively stacked LSTM layers.
  • The compressed state consists of the hidden state and the cell state of each LSTM layer, together with the encoder output (the last time step of the LSTM L1 layer) repeated once for each time step of the prediction.
  • The decoder consists of five successively stacked LSTM layers, of which the last layer, L1, has 16 outputs corresponding to the number of predicted series (multivariate time series).
  • Except for layer L1, the other layers may use different values for the number of outputs.
grafic_autoencoder
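The "encoder output repeated for each prediction time step" part of the compressed state can be pictured with a few lines of plain Python. This is a sketch of the repeat operation only, comparable to what the Keras `RepeatVector` layer does; whether the project uses that layer is not stated here, and the vector width of 16 is an assumption.

```python
from typing import List


def repeat_vector(vector: List[float], n_steps: int) -> List[List[float]]:
    """Copy one encoder output vector into a sequence of n_steps identical rows."""
    return [list(vector) for _ in range(n_steps)]


# Assumed width of 16 for the encoder's final output; values are placeholders.
encoder_last_output = [0.1 * i for i in range(16)]
decoder_input = repeat_vector(encoder_last_output, n_steps=7)
print(len(decoder_input), len(decoder_input[0]))
```

The result is a 7-step sequence that the decoder can unroll over, one row per predicted day.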

Loss

The following graphs show the loss curves obtained from training on each fold produced by TSCV. The number of epochs on the X axis varies because of the EarlyStopping method, which stops the learning process when the monitored metric (val_mean_absolute_percentage_error, in this case) no longer improves.

loss_fold_1 loss_fold_2 loss_fold_3 loss_fold_4 loss_fold_5 loss_fold_6 loss_fold_7 loss_fold_8 loss_fold_9 loss_fold_10
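To make the stopping rule concrete, the sketch below computes a mean absolute percentage error and finds the epoch at which a patience-based early stop would trigger. It mimics the general behaviour of a patience counter and is not the project's training code: the function names, the patience value, and the metric history are assumptions, and `mape` assumes no true value is zero.

```python
from typing import Sequence, Tuple


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent (assumes no zero in y_true)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def early_stopping_epoch(val_metric: Sequence[float], patience: int) -> Tuple[int, int]:
    """Return (stop_epoch, best_epoch), 0-based, for a metric that should decrease."""
    best = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, value in enumerate(val_metric):
        if value < best:
            best, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # no improvement for `patience` epochs
    return len(val_metric) - 1, best_epoch


print(mape([100.0, 200.0], [110.0, 180.0]))
print(early_stopping_epoch([5.0, 4.0, 4.2, 4.1, 4.3, 3.0], patience=3))
```

In the second call the metric last improves at epoch 1, so training would stop at epoch 4 and never reach the better value at epoch 5, which is the usual trade-off when choosing the patience.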

Model Performance

The diagram below shows the model performance using the validation data for each fold.

model_performance

Forecast

The model predictions made on the test data are shown below. The test data consists of the 3 most recent sets of data, in chronological order. The charts below also show the average percentage error for each test set.

forecast_AUD forecast_CAD forecast_CHF forecast_CZK forecast_DKK forecast_EGP forecast_EUR forecast_GBP forecast_HUF forecast_JPY forecast_MDL forecast_NOK forecast_PLN forecast_SEK forecast_TRY forecast_USD
