A direct estimation of high dimensional stationary vector autoregressions

The vector autoregressive (VAR) model is a powerful tool for learning complex time series and has been exploited in many fields. The VAR model poses some unique challenges: on one hand, the dimensionality, introduced by incorporating multiple time series and increasing the order of the autoregression, is usually much higher than the length of the series; on the other hand, the temporal dependence structure naturally present in the VAR model gives rise to extra difficulties in data analysis. The standard approach to fitting the VAR model is via "least squares," usually with a penalty term (e.g., a ridge or lasso penalty) added to handle the high dimensionality. In this manuscript, we propose an alternative way of estimating the VAR model. The main idea is to exploit the temporal dependence structure and formulate the estimation problem as a linear program. The proposed approach has an immediate advantage over lasso-type estimators: the estimation equation decomposes into multiple sub-equations, which can accordingly be solved efficiently using parallel computing. Beyond that, we also bring new theoretical insights into VAR model analysis. The theoretical results developed so far in high dimensions (e.g., Song and Bickel, 2011, and Kock and Callot, 2015) rest on stringent assumptions that are not transparent. Our results, in contrast, show that the spectral norms of the transition matrices play an important role in estimation accuracy, and we establish estimation and prediction consistency accordingly. Moreover, we provide experiments on both synthetic and real-world equity data, which show empirical advantages of our method over lasso-type estimators in both parameter estimation and forecasting.
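To make the idea concrete, here is a minimal sketch of what such a linear-program estimator could look like for a VAR(1) model X_{t+1} = A X_t + e_{t+1}. It uses the Yule-Walker moment equation Sigma_1 = Sigma_0 A^T (with Sigma_0 the lag-0 and Sigma_1 the lag-1 autocovariance), so each row of A can be recovered by a separate Dantzig-selector-style linear program. The function name var_lp_estimate, the tuning parameter lam, and the use of scipy.optimize.linprog are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np
from scipy.optimize import linprog


def var_lp_estimate(X, lam):
    """Row-wise linear-program estimate of the VAR(1) transition matrix A.

    Sketch of a Dantzig-selector-style estimator: row j of A solves
        min ||b||_1  subject to  ||S0 @ b - S1[:, j]||_inf <= lam,
    where S0 and S1 are the lag-0 and lag-1 sample autocovariances.
    """
    T, p = X.shape
    Xc = X - X.mean(axis=0)              # center each component series
    S0 = Xc.T @ Xc / T                   # lag-0 sample autocovariance
    S1 = Xc[:-1].T @ Xc[1:] / (T - 1)    # lag-1 sample autocovariance

    # Split b into b = b_plus - b_minus with b_plus, b_minus >= 0, so that
    # ||b||_1 = 1'(b_plus + b_minus) and the problem becomes a standard LP.
    c = np.ones(2 * p)
    G = np.vstack([np.hstack([S0, -S0]),     #  S0 @ b - S1[:, j] <= lam
                   np.hstack([-S0, S0])])    # -S0 @ b + S1[:, j] <= lam

    A_hat = np.zeros((p, p))
    for j in range(p):                       # independent sub-problems
        h = np.concatenate([lam + S1[:, j], lam - S1[:, j]])
        res = linprog(c, A_ub=G, b_ub=h, bounds=(0, None), method="highs")
        if not res.success:
            raise RuntimeError(f"LP for row {j} failed: {res.message}")
        A_hat[j] = res.x[:p] - res.x[p:]     # row j of A, as Sigma_1 = Sigma_0 A^T
    return A_hat
```

Because each of the p sub-problems touches only one column of the lag-1 autocovariance, the loop over j can be distributed across workers, which is the parallelism the abstract alludes to; in practice lam would be tuned, e.g., by cross-validation over time blocks.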

[1] C. Sims. Macroeconomics and Reality, 1977.

[2] Po-Ling Loh, et al. High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity, 2011, NIPS.

[3] P. Massart, et al. Concentration inequalities and model selection, 2007.

[4] N. Meinshausen, et al. High-dimensional graphs and variable selection with the Lasso, 2006, math/0608017.

[5] Charles R. Johnson, et al. Matrix Analysis, 1985.

[6] R. C. Bradley. Basic properties of strong mixing conditions. A survey and some open questions, 2005, math/0511078.

[7] A. B. Kock and L. Callot. Oracle Inequalities for High Dimensional Vector Autoregressions, 2012, 1311.0811.

[8] T. Cai, et al. A Constrained ℓ1 Minimization Approach to Sparse Precision Matrix Estimation, 2011, 1102.2233.

[9] Ruey S. Tsay, et al. Analysis of Financial Time Series, 2005.

[10] Han Liu, et al. Joint estimation of multiple graphical models from high dimensional time series, 2013, Journal of the Royal Statistical Society, Series B, Statistical Methodology.

[11] H. Zou. The Adaptive Lasso and Its Oracle Properties, 2006.

[12] Ziv Bar-Joseph, et al. Analyzing time series gene expression data, 2004, Bioinformatics.

[13] Martin J. Wainwright, et al. Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using $\ell_1$-Constrained Quadratic Programming (Lasso), 2009, IEEE Transactions on Information Theory.

[14] S. Haufe, et al. Sparse Causal Discovery in Multivariate Time Series, 2010.

[15] P. Bickel, et al. Covariance regularization by thresholding, 2009, 0901.3079.

[16] S. Song and P. Bickel. Large Vector Auto Regressions, 2011, 1106.3915.

[17] P. Bickel, et al. Regularized estimation of large covariance matrices, 2008, 0803.1909.

[18] Olivier Ledoit, et al. Improved estimation of the covariance matrix of stock returns with an application to portfolio selection, 2003.

[19] Fang Han, et al. Transition Matrix Estimation in High Dimensional Time Series, 2013, ICML.

[20] Peng Zhao, et al. On Model Selection Consistency of Lasso, 2006, J. Mach. Learn. Res.

[21] Harrison H. Zhou, et al. Optimal rates of convergence for covariance matrix estimation, 2010, 1010.3866.

[22] J. H. Ahlberg, et al. Convergence Properties of the Spline Fit, 1963.

[23] Chih-Ling Tsai, et al. Regression coefficient and autoregressive order shrinkage and selection via the lasso, 2007.

[24] Y. Nardi, et al. Autoregressive process modeling via the Lasso procedure, 2008, J. Multivar. Anal.

[25] Larry A. Wasserman, et al. High Dimensional Semiparametric Gaussian Copula Graphical Models, 2012, ICML.

[26] J. Varah. A lower bound for the smallest singular value of a matrix, 1975.

[27] Karl J. Friston, et al. Modelling Geometric Deformations in EPI Time Series, 2001.

[28] Trevor Hastie, et al. Regularization Paths for Generalized Linear Models via Coordinate Descent, 2010, Journal of Statistical Software.

[29] J. Friedman, et al. Predicting Multivariate Responses in Multiple Linear Regression, 1997.

[30] Lester Melie-García, et al. Estimating brain functional connectivity with sparse multivariate autoregression, 2005, Philosophical Transactions of the Royal Society B: Biological Sciences.

[31] Helmut Lütkepohl. New Introduction to Multiple Time Series Analysis, 2007.

[32] Martin J. Wainwright, et al. Minimax Rates of Estimation for High-Dimensional Linear Regression Over $\ell_q$-Balls, 2009, IEEE Transactions on Information Theory.

[33] Jing Lei, et al. Minimax Rates of Estimation for Sparse PCA in High Dimensions, 2012, AISTATS.

[34] Wolfgang Härdle, et al. Generalized Dynamic Semi-Parametric Factor Models for High-Dimensional Non-Stationary Time Series, 2014.

[35] R. C. Bradley. Basic Properties of Strong Mixing Conditions, 1985.

[36] Andrea Montanari, et al. Learning Networks of Stochastic Differential Equations, 2010, NIPS.

[37] Richard J. Klimoski, et al. Handbook of Psychology: Industrial and Organizational Psychology, Vol. 12, 2003.

[38] M. D. Dunnette. Handbook of Industrial and Organizational Psychology, 2005.

[39] Ming Yuan, et al. High Dimensional Inverse Covariance Matrix Estimation via Linear Programming, 2010, J. Mach. Learn. Res.

[40] Anders Bredahl Kock, et al. On the Oracle Property of the Adaptive Lasso in Stationary and Nonstationary Autoregressions, 2012.

[41] James D. Hamilton. Time Series Analysis, 1994.

[42] Xiaoming Yuan, et al. The flare package for high dimensional linear regression and precision matrix estimation in R, 2020, J. Mach. Learn. Res.

[43] Yuval Rabani, et al. Linear Programming, 2007, Handbook of Approximation Algorithms and Metaheuristics.

[44] Wolfgang Härdle, et al. High Dimensional Nonstationary Time Series Modelling with Generalized Dynamic Semiparametric Factor Model, 2010.

[45] Nan-Jung Hsu, et al. Subset selection for vector autoregressive processes using Lasso, 2008, Comput. Stat. Data Anal.

[46] P. Bickel, et al. Simultaneous Analysis of Lasso and Dantzig Selector, 2008, 0801.1095.

[47] Terence Tao, et al. The Dantzig selector: Statistical estimation when p is much larger than n, 2005, math/0506081.

[48] Naoki Abe, et al. Grouped graphical Granger modeling for gene expression regulatory networks discovery, 2009, Bioinformatics.

[49] C. Granger. Investigating causal relations by econometric models and cross-spectral methods, 1969.

[50] Martin J. Wainwright, et al. Estimation of (near) low-rank matrices with noise and high-dimensional scaling, 2009, ICML.

[51] Ali Shojaie, et al. Discovering graphical Granger causality using the truncating lasso penalty, 2010, Bioinformatics.