Economic Predictions with Big Data: The Illusion of Sparsity

We compare sparse and dense representations of predictive models in macroeconomics, microeconomics and finance. To deal with a large number of possible predictors, we specify a "spike-and-slab" prior that allows for both variable selection and shrinkage. The posterior distribution does not typically concentrate on a single sparse or dense model but on a wide set of models. A clearer pattern of sparsity can only emerge when models of very low dimension are strongly favored a priori.
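
To make the role of the prior concrete, the following is a minimal sketch of what drawing regression coefficients from a point-mass spike-and-slab prior could look like. It is an illustration under stated assumptions, not the paper's implementation: the function name `draw_spike_and_slab` and the parameters `q` (prior inclusion probability) and `slab_sd` (standard deviation of the Gaussian slab) are hypothetical names chosen for this example. Each coefficient is exactly zero with probability 1 - q (the spike, which performs variable selection) and is otherwise drawn from a zero-mean Gaussian (the slab, whose variance controls shrinkage).

```python
import random


def draw_spike_and_slab(k, q, slab_sd, seed=0):
    """Draw k coefficients from a point-mass spike-and-slab prior.

    Assumed form (illustrative): each coefficient is 0 with probability
    1 - q, and N(0, slab_sd**2) with probability q.
    """
    rng = random.Random(seed)
    coefs = []
    for _ in range(k):
        included = rng.random() < q  # Bernoulli(q) inclusion indicator
        coefs.append(rng.gauss(0.0, slab_sd) if included else 0.0)
    return coefs


# Low q favors sparse models: few predictors enter, each with a sizable slab.
sparse_draw = draw_spike_and_slab(k=100, q=0.05, slab_sd=1.0)
# High q favors dense models: most predictors enter, but are shrunk heavily.
dense_draw = draw_spike_and_slab(k=100, q=0.95, slab_sd=0.1)

print(sum(b != 0.0 for b in sparse_draw), "nonzero of 100 (low q)")
print(sum(b != 0.0 for b in dense_draw), "nonzero of 100 (high q)")
```

In this sketch, fixing `q` near zero corresponds to strongly favoring low-dimensional models a priori, whereas treating `q` as unknown lets the data inform the degree of sparsity.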
