Backward‐in‐Time Selection of the Order of Dynamic Regression Prediction Model

We investigate the optimal structure of dynamic regression models used in multivariate time series prediction and propose a scheme for forming the lagged variable structure, called Backward-in-Time Selection (BTS), that takes into account the feedback and multi-collinearity often present in multivariate time series. We compare BTS to other known methods, also in conjunction with regularization techniques used for the estimation of model parameters, namely principal components, partial least squares and ridge regression estimation. The predictive efficiency of the different models is assessed by means of Monte Carlo simulations for different settings of feedback and multi-collinearity. The results show that BTS has consistently good prediction performance, while other popular methods have varying and often inferior performance. The prediction performance of BTS was also found to be the best when tested on human electroencephalograms of an epileptic seizure and on the prediction of returns of indices of world financial markets.
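To make the setting concrete, the following is a minimal sketch of a dynamic regression with a given lagged variable structure, estimated by ridge regression. It is not the BTS algorithm, whose selection steps are not described here: the lag orders are simply fixed by hand. The function names `lagged_design` and `ridge_fit`, the simulated data, and the penalty value `alpha` are all illustrative assumptions.

```python
import numpy as np

def lagged_design(series, lags):
    """Build a design matrix from lagged copies of each variable.

    series: array of shape (n, K), one column per variable.
    lags:   list of K non-negative ints; lags[i] is how many past values
            of variable i are included (0 drops the variable).
    Returns (X, start); row r of X holds the regressors for time start + r.
    """
    series = np.asarray(series, dtype=float)
    n, K = series.shape
    start = max(lags)
    cols = []
    for i in range(K):
        for j in range(1, lags[i] + 1):
            # Value of variable i at time t - j, for t = start, ..., n - 1.
            cols.append(series[start - j : n - j, i])
    X = np.column_stack(cols) if cols else np.empty((n - start, 0))
    return X, start

def ridge_fit(X, y, alpha):
    """Ridge estimate without intercept: solve (X'X + alpha*I) b = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

# Simulated example (assumed for illustration): y depends on its own first
# lag and on the second lag of a driving variable x.
rng = np.random.default_rng(0)
n = 500
x = rng.standard_normal(n)
e = rng.standard_normal(n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 2] + 0.1 * e[t]

data = np.column_stack([y, x])
X, start = lagged_design(data, lags=[2, 2])  # columns: y(t-1), y(t-2), x(t-1), x(t-2)
target = data[start:, 0]
coef = ridge_fit(X, target, alpha=1e-3)
print(np.round(coef, 2))
```

With a small penalty the estimates for y(t-1) and x(t-2) should land close to the simulated values 0.5 and 0.8, and the two superfluous lags close to zero. A selection scheme such as BTS addresses the question this sketch sidesteps, namely which entries of `lags` to use for each variable.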
