Learning Predictive Leading Indicators for Forecasting Time Series Systems with Unknown Clusters of Forecast Tasks

We present a new method for forecasting systems of multiple interrelated time series. The method learns the forecast models jointly with two structures: leading indicators, that is, series from within the system that serve as good predictors and improve forecast accuracy, and a cluster structure of the predictive tasks around these indicators. The method is based on the classical linear vector autoregressive (VAR) model and links the discovery of leading indicators to the inference of sparse graphs of Granger causality. We formulate a new constrained optimisation problem that promotes the desired sparse structures across the models and the sharing of information amongst the learning tasks in a multi-task manner. We propose an algorithm for solving the problem and document, on a battery of synthetic and real-data experiments, the advantages of our method over baseline VAR models as well as state-of-the-art sparse VAR learning methods.
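To make the link between a VAR model, its Granger-causality graph, and leading indicators concrete, the following is a minimal sketch and not the method proposed here. It assumes a VAR(1) system, fits it by ordinary least squares (the baseline VAR), and reads off the graph by thresholding coefficients, which is a crude stand-in for the constrained optimisation problem described above. All names (`simulate_var1`, `fit_var1`, `granger_graph`, `leading_indicators`), the coefficient values, and the threshold are hypothetical choices for illustration.

```python
import numpy as np


def simulate_var1(A, T, seed=0):
    """Simulate y_t = A y_{t-1} + e_t with standard normal noise e_t."""
    rng = np.random.default_rng(seed)
    K = A.shape[0]
    Y = np.zeros((T, K))
    Y[0] = rng.normal(size=K)
    for t in range(1, T):
        Y[t] = A @ Y[t - 1] + rng.normal(size=K)
    return Y


def fit_var1(Y):
    """Least-squares VAR(1) fit: one regression (forecast task) per series."""
    X, Z = Y[:-1], Y[1:]                       # lagged inputs, one-step targets
    B, *_ = np.linalg.lstsq(X, Z, rcond=None)  # solves Z ~ X @ B
    return B.T                                 # row k holds the model for series k


def granger_graph(A_hat, tol):
    """Boolean graph: G[k, j] is True if series j helps predict series k."""
    G = np.abs(A_hat) > tol
    np.fill_diagonal(G, False)                 # ignore each series' own past
    return G


def leading_indicators(G):
    """Series that are predictors for at least one other series."""
    return np.flatnonzero(G.sum(axis=0) > 0)


# Assumed toy system: series 0 leads series 2 and 3, series 1 leads 4 and 5.
K = 6
A_true = 0.5 * np.eye(K)
A_true[[2, 3], 0] = 0.4
A_true[[4, 5], 1] = 0.4

Y = simulate_var1(A_true, T=2000)
A_hat = fit_var1(Y)
G = granger_graph(A_hat, tol=0.2)              # threshold is an arbitrary choice
leaders = leading_indicators(G)
print("leading indicators:", leaders)
```

In this toy system the forecast tasks fall into two clusters, each built around one leading indicator. A hard threshold on least-squares estimates only works here because the sample is long and the true coefficients are well separated from zero; learning the sparsity and the clusters jointly with the models is the problem the method above addresses.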
