A Theory-Based Lasso for Time-Series Data

We present two new lasso estimators, the HAC-lasso and the AC-lasso, that are suitable for time-series applications. The estimators are variations of the theory-based or ‘rigorous’ lasso of Bickel et al. (2009), Belloni et al. (2011), Belloni and Chernozhukov (2013) and Belloni et al. (2016), recently extended to the case of dependent data by Chernozhukov et al. (2019), in which the lasso penalty level is derived on theoretical grounds. The rigorous lasso has appealing theoretical properties and is computationally very attractive compared with conventional cross-validation. The AC-lasso version of the rigorous lasso accommodates dependence of arbitrary form in the disturbance term, so long as the dependence is known to die out after q periods; the HAC-lasso also allows for heteroskedasticity of arbitrary form. The HAC- and AC-lasso are particularly well suited to applications such as nowcasting, where the time series may be short and the dimensionality of the predictors is high. We present Monte Carlo comparisons of the performance of the HAC-lasso against penalty selection by cross-validation. Finally, we use the HAC-lasso to estimate a nowcasting model of US GDP growth based on Google Trends data and compare its performance to the Bayesian methods employed by Kohns and Bhattacharjee (2019).
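The abstract does not state the formulas, so the following is only a minimal sketch of the two ingredients it describes: a penalty level set by theory rather than cross-validation, and penalty loadings that allow for heteroskedasticity and for autocorrelation up to q lags. The penalty form 2c√n Φ⁻¹(1 − γ/(2p)), the defaults c = 1.1 and γ = 0.1/log(n), the truncated (unweighted) lag kernel, and the function names `rigorous_lambda` and `hac_loadings` are assumptions made for illustration, not the authors' implementation.

```python
import math
from statistics import NormalDist

import numpy as np


def rigorous_lambda(n, p, c=1.1, gamma=None):
    """Theory-based penalty level (assumed form): 2c*sqrt(n)*Phi^{-1}(1 - gamma/(2p))."""
    if gamma is None:
        gamma = 0.1 / math.log(n)  # assumed default, shrinks slowly with n
    return 2.0 * c * math.sqrt(n) * NormalDist().inv_cdf(1.0 - gamma / (2.0 * p))


def hac_loadings(X, resid, q):
    """Per-regressor penalty loadings from a truncated-kernel long-run
    variance of the scores x_tj * e_t, using autocovariances up to lag q.
    With q = 0 this reduces to heteroskedasticity-robust loadings."""
    n = X.shape[0]
    scores = X * resid[:, None]                # n x p matrix of x_tj * e_t
    lag0 = (scores ** 2).sum(axis=0) / n       # lag-0 variance term
    var = lag0.copy()
    for s in range(1, q + 1):
        # add the lag-s autocovariance twice (lags +s and -s)
        var += 2.0 * (scores[s:] * scores[:-s]).sum(axis=0) / n
    # an unweighted kernel can yield a non-positive estimate; fall back to lag 0
    var = np.where(var > 0, var, lag0)
    return np.sqrt(var)


# Illustrative use on simulated data (not data from the paper)
rng = np.random.default_rng(0)
n, p, q = 80, 200, 4
X = rng.standard_normal((n, p))
resid = rng.standard_normal(n)                 # stand-in for initial residuals
lam = rigorous_lambda(n, p)
psi = hac_loadings(X, resid, q)
```

In a full estimator, the residuals would come from an initial fit and the loadings would be updated iteratively; here `resid` is only a placeholder. The penalty on coefficient j would then be proportional to `lam * psi[j]`, with no cross-validation loop required.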

[1]  H. Zou The Adaptive Lasso and Its Oracle Properties , 2006 .

[2]  M. Yuan,et al.  Model selection and estimation in regression with grouped variables , 2006 .

[3]  Susan Athey,et al.  The Impact of Machine Learning on Economics , 2018, The Economics of Artificial Intelligence.

[4]  H. Akaike Autoregressive model fitting for control , 1971 .

[5]  John Bock Quantifying Macroeconomic Expectations in Stock Markets Using Google Trends , 2018, 1805.00268.

[6]  A. Belloni,et al.  Inference on Treatment Effects after Selection Amongst High-Dimensional Controls , 2011, 1201.0224.

[7]  A. Belloni,et al.  Inference for High-Dimensional Sparse Econometric Models , 2011, 1201.0220.

[8]  Edmond Chow,et al.  A cross-validatory method for dependent data , 1994 .

[9]  Bing-Yi Jing,et al.  Self-normalized Cramér-type large deviations for independent random variables , 2003 .

[10]  Hal R. Varian,et al.  Big Data: New Tricks for Econometrics , 2014 .

[11]  Jonathan H. Wright,et al.  A Survey of Weak Instruments and Weak Identification in Generalized Method of Moments , 2002 .

[12]  Y. Nardi,et al.  Autoregressive process modeling via the Lasso procedure , 2008, J. Multivar. Anal..

[13]  Hui Miao Model selection and estimation in additive regression models , 2009 .

[14]  L. Wasserman,et al.  HIGH DIMENSIONAL VARIABLE SELECTION. , 2007, Annals of statistics.

[15]  D. Romer,et al.  Federal Reserve Information and the Behavior of Interest Rates , 2000 .

[16]  Peter Buhlmann Statistical significance in high-dimensional linear models , 2012, 1202.1377.

[17]  Mark E. Schaffer,et al.  lassopack: Model selection and prediction with regularized regression in Stata , 2019, 1901.05397.

[18]  Gary Koop,et al.  Macroeconomic Nowcasting Using Google Probabilities , 2016, Advances in Econometrics.

[19]  Anna Simoni,et al.  When Are Google Data Useful to Nowcast GDP? An Approach via Pre-Selection and Shrinkage , 2019 .

[20]  H. Akaike A new look at the statistical model identification , 1974 .

[21]  G. Schwarz Estimating the Dimension of a Model , 1978 .

[22]  Lorenzo Rosasco,et al.  Elastic-net regularization in learning theory , 2008, J. Complex..

[23]  Eric Ghysels,et al.  The MIDAS Touch: Mixed Data Sampling Regression Models , 2022 .

[24]  Steven L. Scott,et al.  Predicting the Present with Bayesian Structural Time Series , 2013 .

[25]  Domenico Giannone,et al.  Economic Predictions with Big Data: The Illusion of Sparsity , 2017, Econometrica.

[26]  Helmut Lütkepohl,et al.  New Introduction to Multiple Time Series Analysis , 2007 .

[27]  Sylvain Arlot,et al.  A survey of cross-validation procedures for model selection , 2009, 0907.4728.

[28]  Victor Chernozhukov,et al.  On cross-validated Lasso in high dimensions , 2020 .

[29]  Victor Chernozhukov,et al.  High Dimensional Sparse Econometric Models: An Introduction , 2011, 1106.5242.

[30]  Christian Hansen,et al.  Inference in High-Dimensional Panel Models With an Application to Gun Control , 2014, 1411.6507.

[31]  Runze Li,et al.  Regularization Parameter Selections via Generalized Information Criterion , 2010, Journal of the American Statistical Association.

[32]  H. Varian,et al.  Predicting the Present with Google Trends , 2012 .

[33]  R. Tibshirani,et al.  A SIGNIFICANCE TEST FOR THE LASSO. , 2013, Annals of statistics.

[34]  Christopher A. Sims,et al.  The Role of Models and Probabilities in the Monetary Policy Process , 2002 .

[35]  J. Friedman,et al.  A Statistical View of Some Chemometrics Regression Tools , 1993 .

[36]  B. G. Quinn,et al.  The determination of the order of an autoregression , 1979 .

[37]  Rob J. Hyndman,et al.  A note on the validity of cross-validation for evaluating autoregressive time series prediction , 2018, Comput. Stat. Data Anal..

[38]  Robert B. Litterman,et al.  Forecasting and Conditional Projection Using Realistic Prior Distributions , 1983 .

[39]  Jianqing Fan,et al.  Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties , 2001 .

[40]  Domenico Giannone,et al.  Exploiting the Monthly Data Flow in Structural Forecasting , 2015 .

[41]  Paul Smith,et al.  Google's MIDAS Touch: Predicting UK Unemployment with Internet Search Data , 2015 .

[42]  Peter Bühlmann,et al.  p-Values for High-Dimensional Regression , 2008, 0811.2177.

[43]  A. Belloni,et al.  SPARSE MODELS AND METHODS FOR OPTIMAL INSTRUMENTS WITH AN APPLICATION TO EMINENT DOMAIN , 2012 .

[44]  Massimiliano Marcellino,et al.  A comparison of mixed frequency approaches for nowcasting Euro area macroeconomic aggregates , 2014 .

[45]  D. Giannone,et al.  Large Bayesian VARs , 2008, SSRN Electronic Journal.

[46]  Fabian J. Theis,et al.  TREVOR HASTIE, ROBERT TIBSHIRANI, AND MARTIN WAINWRIGHT. Statistical Learning with Sparsity: The Lasso and Generalizations. Boca Raton: CRC Press. , 2018, Biometrics.

[47]  Robert B. Litterman Forecasting with Bayesian Vector Autoregressions-Five Years of Experience , 1984 .

[48]  Dean Croushore,et al.  Forecasting with Real-Time Macroeconomic Data , 2006 .

[49]  Jure Leskovec,et al.  Human Decisions and Machine Predictions , 2017, The quarterly journal of economics.

[50]  John H. Gerdes,et al.  Using web-based search data to predict macroeconomic statistics , 2005, CACM.

[51]  H. Varian,et al.  Predicting the Present with Google Trends , 2009 .

[52]  R. Tibshirani,et al.  Sparsity and smoothness via the fused lasso , 2005 .

[53]  Yuhong Yang Can the Strengths of AIC and BIC Be Shared , 2005 .

[54]  Christian Hansen,et al.  Post-Selection and Post-Regularization Inference in Linear Models with Many Controls and Instruments , 2015, 1501.03185.

[55]  George Kapetanios,et al.  Big Data Econometrics: Now Casting and Early Estimates , 2017 .

[56]  A. Bhattacharjee,et al.  Interpreting Big Data in the Macro Economy: A Bayesian Mixed Frequency Estimator , 2019 .

[57]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[58]  Klaus F. Zimmermann,et al.  Google Econometrics and Unemployment Forecasting , 2009 .

[59]  H. Akaike Fitting autoregressive models for prediction , 1969 .

[60]  A. Belloni,et al.  Least Squares After Model Selection in High-Dimensional Sparse Models , 2009, 1001.0188.

[62]  Victor Chernozhukov,et al.  LASSO-Driven Inference in Time and Space , 2018, The Annals of Statistics.

[63]  I. Seidl,et al.  The socio-economic determinants of urban sprawl between 1980 and 2010 in Switzerland , 2017 .

[64]  Massimiliano Marcellino,et al.  Realtime nowcasting with a Bayesian mixed frequency model with stochastic volatility , 2012, Journal of the Royal Statistical Society. Series A.

[65]  Nan-Jung Hsu,et al.  Subset selection for vector autoregressive processes using Lasso , 2008, Comput. Stat. Data Anal..

[66]  P. Bickel,et al.  SIMULTANEOUS ANALYSIS OF LASSO AND DANTZIG SELECTOR , 2008, 0801.1095.

[67]  Sendhil Mullainathan,et al.  Machine Learning: An Applied Econometric Approach , 2017, Journal of Economic Perspectives.

[68]  Eduardo F. Mendes,et al.  ℓ1-regularization of high-dimensional time-series models with non-Gaussian and heteroskedastic errors , 2016 .