Predictors Extraction in Time Series Using Authorities-Hubs Ranking

Modern systems must store and process large time series. As the number of related variables grows, forecasting becomes harder, and using all variables causes problems for classical prediction models. The question is then how to select the most relevant set of predictors for a given target time series. In this paper, we present a prediction process for forecasting large multivariate time series and propose a novel approach for identifying predictors. The approach analyses the dependencies between candidate predictors and a target variable using the mutual reinforcement principle between hubs and authorities from the HITS (Hyperlink-Induced Topic Search) algorithm, which was originally developed to analyze and rank web pages. The results of our experiments are promising: the proposed algorithm selects predictors that improve the forecasts of many variables compared with methods currently in use, such as PCA, Kernel PCA, Factor Analysis, and FCBF (Fast Correlation-Based Filter).
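To make the mutual reinforcement principle concrete, the following is a minimal sketch of the classic HITS iteration on a small weighted graph. It is not the paper's selection procedure: the abstract does not say how the dependency graph is built, so the matrix `adj`, the function name `hits`, and the reading of `adj[i, j]` as "variable i helps predict variable j" are assumptions made only for illustration.

```python
import numpy as np


def hits(adj, n_iter=100, tol=1e-10):
    """Classic HITS power iteration (illustrative sketch).

    adj[i, j] is the non-negative weight of the edge i -> j.
    Returns (hubs, auths), each scaled to unit Euclidean norm.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    hubs = np.ones(n)
    auths = np.ones(n)
    for _ in range(n_iter):
        # A node is a good authority if good hubs point to it.
        new_auths = adj.T @ hubs
        # A node is a good hub if it points to good authorities.
        new_hubs = adj @ new_auths
        # Normalize so the scores stay bounded across iterations.
        a_norm = np.linalg.norm(new_auths)
        h_norm = np.linalg.norm(new_hubs)
        if a_norm > 0:
            new_auths = new_auths / a_norm
        if h_norm > 0:
            new_hubs = new_hubs / h_norm
        converged = (
            np.abs(new_auths - auths).max() < tol
            and np.abs(new_hubs - hubs).max() < tol
        )
        hubs, auths = new_hubs, new_auths
        if converged:
            break
    return hubs, auths


# Hypothetical dependency graph: variables 0 and 1 both point to variable 2.
adj = np.array([
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
])
hubs, auths = hits(adj)
print("hub scores:      ", hubs)
print("authority scores:", auths)
```

Under this hypothetical reading, variables with high hub scores would be the natural predictor candidates, and the target would appear as a strong authority. How the paper actually weights the edges and turns the ranking into a predictor set is not stated in the abstract.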
