Sparse Identification and Estimation of Large-Scale Vector AutoRegressive Moving Averages

Abstract. The vector autoregressive moving average (VARMA) model is fundamental to the theory of multivariate time series; however, identifiability issues have led practitioners to abandon it in favor of the simpler but more restrictive vector autoregressive (VAR) model. We narrow this gap with a new optimization-based approach to VARMA identification built upon the principle of parsimony. Among all equivalent data-generating models, we use convex optimization to seek the parameterization that is simplest in a certain sense. A user-specified strongly convex penalty is used to measure model simplicity, and that same penalty is then used to define an estimator that can be efficiently computed. We establish consistency of our estimators in a double-asymptotic regime. Our nonasymptotic error bound analysis accommodates both model specification and parameter estimation steps, a feature that is crucial for studying large-scale VARMA algorithms. Our analysis also provides new results on penalized estimation of infinite-order VAR, and elastic net regression under a singular covariance structure of regressors, which may be of independent interest. We illustrate the advantage of our method over VAR alternatives on three real-data examples.
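
The identifiability issue and the role of a strongly convex penalty can be illustrated with a small numerical sketch. It is not the paper's algorithm. It assumes the convention y_t = Phi y_{t-1} + a_t + Theta a_{t-1}, a textbook-style bivariate VARMA(1,1) family indexed by a free scalar m, and an l1 penalty plus a small squared-norm term as one possible strongly convex choice. The names `simulate_varma11`, `family`, `penalty`, and the weight `ridge` are hypothetical and used only for illustration.

```python
import numpy as np


def simulate_varma11(phi, theta, noise):
    """Simulate y_t = phi @ y_{t-1} + a_t + theta @ a_{t-1}, starting from y_0 = a_0."""
    y = np.zeros_like(noise)
    y[0] = noise[0]
    for t in range(1, len(noise)):
        y[t] = phi @ y[t - 1] + noise[t] + theta @ noise[t - 1]
    return y


def family(alpha, m):
    """One member of an observationally equivalent VARMA(1,1) family (assumed example)."""
    phi = np.array([[0.0, alpha + m], [0.0, 0.0]])
    theta = np.array([[0.0, -m], [0.0, 0.0]])
    return phi, theta


def penalty(phi, theta, ridge=0.1):
    """l1 norm plus a squared-norm term; the second term makes the penalty strongly convex."""
    coefs = np.concatenate([phi.ravel(), theta.ravel()])
    return np.abs(coefs).sum() + 0.5 * ridge * (coefs ** 2).sum()


rng = np.random.default_rng(0)
noise = rng.standard_normal((200, 2))
alpha = 1.0

# Every value of m produces the same series from the same innovations,
# so the data alone cannot tell these parameterizations apart.
reference = simulate_varma11(*family(alpha, 0.0), noise)
same_series = all(
    np.allclose(simulate_varma11(*family(alpha, m), noise), reference)
    for m in (-1.0, -0.5, 0.3)
)

# The plain l1 norm is flat for m in [-alpha, 0]; the squared-norm term breaks the tie.
grid = np.linspace(-1.5, 0.5, 2001)
l1_only = np.array([penalty(*family(alpha, m), ridge=0.0) for m in grid])
strongly_convex = np.array([penalty(*family(alpha, m)) for m in grid])
best_m = grid[np.argmin(strongly_convex)]

print("same series for all m:", same_series)
print("number of grid points minimizing the l1 norm:", int(np.isclose(l1_only, l1_only.min()).sum()))
print("m selected by the strongly convex penalty:", round(float(best_m), 3))
```

In this toy family the l1 norm alone is minimized by a whole interval of values of m, while adding the squared-norm term picks out a single member. This is the kind of uniqueness that motivates requiring a strongly convex penalty.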
