Improving Multivariate Time Series Forecasting with Random Walks with Restarts on Causality Graphs

Forecasting models that use multiple predictors are gaining popularity in a variety of fields. By leveraging the predictive potential of many variables, they can in some cases yield more precise forecasts. In practice, however, we do not know which observed predictors have a direct impact on the target variable, and adding unrelated variables may degrade forecast quality. Constructing the set of predictor variables for a forecasting model is therefore one of the greatest challenges in forecasting. We propose a new predictor selection method based on a directed causality graph and a modified random walk with restarts. Experiments on two popular macroeconomic datasets, from the US and Australia, show that this simple and scalable approach performs well compared with other well-established methods.
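To make the idea concrete, the following is a minimal sketch of a standard random walk with restarts used to rank candidate predictors on a directed causality graph. It is not the modified model proposed in the paper, whose details are not given here. The function names (`rwr_scores`, `select_predictors`), the edge-weight convention (`graph[cause][effect]`), the restart probability of 0.15, and the toy variable names are all assumptions made for illustration. The walker starts at the target, steps backwards from an effect to one of its causes with probability proportional to the edge weight, and restarts at the target with a fixed probability; variables are then ranked by their stationary visiting probability.

```python
from typing import Dict, List

# Assumed convention: graph[cause][effect] = non-negative causality weight.
Graph = Dict[str, Dict[str, float]]


def rwr_scores(graph: Graph, target: str, restart: float = 0.15,
               tol: float = 1e-10, max_iter: int = 1000) -> Dict[str, float]:
    """Stationary visiting probabilities of a walk that restarts at `target`."""
    nodes = sorted(set(graph) | {v for nbrs in graph.values() for v in nbrs} | {target})

    # Reverse the edges: from an effect, the walker steps to one of its causes.
    causes: Dict[str, Dict[str, float]] = {v: {} for v in nodes}
    for cause, effects in graph.items():
        for effect, weight in effects.items():
            if weight > 0:
                causes[effect][cause] = weight

    scores = {v: 0.0 for v in nodes}
    scores[target] = 1.0  # all probability mass starts at the target

    for _ in range(max_iter):
        new = {v: 0.0 for v in nodes}
        for v, mass in scores.items():
            total = sum(causes[v].values())
            if total == 0.0:
                # No known causes: the walker can only return to the target.
                new[target] += mass
                continue
            new[target] += restart * mass
            for u, weight in causes[v].items():
                new[u] += (1.0 - restart) * mass * weight / total
        delta = sum(abs(new[v] - scores[v]) for v in nodes)
        scores = new
        if delta < tol:
            break
    return scores


def select_predictors(graph: Graph, target: str, k: int,
                      restart: float = 0.15) -> List[str]:
    """Return the k variables with the highest scores, excluding the target."""
    scores = rwr_scores(graph, target, restart)
    ranked = sorted((v for v in scores if v != target),
                    key=lambda v: (-scores[v], v))
    return ranked[:k]


# Toy causality graph with hypothetical variable names.
toy_graph: Graph = {
    "oil": {"cpi": 0.8},
    "rates": {"cpi": 0.2, "oil": 0.5},
    "noise": {"rain": 1.0},  # not connected to the target
}
top_two = select_predictors(toy_graph, "cpi", k=2)
```

In this toy graph, `rates` ranks above `oil` even though its direct edge to `cpi` is weaker, because it also reaches the target indirectly through `oil`; variables with no causal path to the target receive a score of zero. This is the kind of behavior that motivates a graph-based ranking over a purely pairwise one, although the exact scoring used by the authors may differ.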
