IPAD: Stable Interpretable Forecasting with Knockoffs Inference

Interpretability and stability are two features desired in many contemporary big data applications arising in economics and finance. While many existing forecasting approaches enjoy the former to some extent, the latter, in the sense of controlling the fraction of wrongly discovered features, which can greatly enhance interpretability, remains largely underdeveloped in econometric settings. To this end, in this paper we exploit the general framework of model-X knockoffs introduced recently in Candès, Fan, Janson and Lv (2018), which is unconventional for reproducible large-scale inference in that it is completely free of p-values for significance testing, and suggest a new method of intertwined probabilistic factors decoupling (IPAD) for stable interpretable forecasting with knockoffs inference in high-dimensional models. The recipe of the method is to construct the knockoff variables by assuming a latent factor model, which is widely used in economics and finance, for the association structure of the covariates. Our method and work are distinct from the existing literature in several respects: we estimate the covariate distribution from data instead of assuming it is known when constructing the knockoff variables; our procedure does not require any sample splitting; we provide theoretical justification for asymptotic false discovery rate control; and we establish theory for the power analysis. Several simulation examples and a real data analysis further demonstrate that the newly suggested method has appealing finite-sample performance, with the desired interpretability and stability, compared with some popularly used forecasting methods.
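To make the recipe concrete, below is a minimal Python sketch of knockoff inference with factor-model-based knockoffs, in the spirit of the abstract. It assumes a known number of factors and Gaussian idiosyncratic errors, and uses a cross-validated Lasso for the knockoff statistics; the function names (factor_knockoffs, knockoff_select) and all tuning choices are illustrative and are not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def factor_knockoffs(X, n_factors, rng):
    """Construct knockoff copies of X under an approximate factor model.

    Estimate the common component C = F @ Lambda.T by PCA (SVD), then form
    knockoffs as C_hat plus freshly drawn idiosyncratic noise whose scale
    matches the PCA residuals (a simplifying Gaussian assumption).
    """
    n, p = X.shape
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    C_hat = U[:, :n_factors] * s[:n_factors] @ Vt[:n_factors]  # common component
    resid = X - C_hat
    sigma = resid.std(axis=0)                # per-variable residual scale
    return C_hat + rng.standard_normal((n, p)) * sigma

def knockoff_select(X, y, n_factors=3, fdr=0.2, seed=0):
    """Knockoff filter with factor-model knockoffs and Lasso coefficient statistics."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    X_ko = factor_knockoffs(X, n_factors, rng)
    # Fit the Lasso on the augmented design [X, X_ko].
    beta = LassoCV(cv=5).fit(np.hstack([X, X_ko]), y).coef_
    # Knockoff statistic: large positive W_j favors the real variable over its copy.
    W = np.abs(beta[:p]) - np.abs(beta[p:])
    # Knockoff+ threshold: smallest t with estimated FDP below the target level.
    T = np.inf
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
        if fdp_hat <= fdr:
            T = t
            break
    return np.flatnonzero(W >= T)

if __name__ == "__main__":
    # Toy example: y depends on the first 5 of 50 factor-driven covariates.
    rng = np.random.default_rng(1)
    n, p, k = 200, 50, 3
    F, L = rng.standard_normal((n, k)), rng.standard_normal((p, k))
    X = F @ L.T + rng.standard_normal((n, p))
    y = X[:, :5] @ np.ones(5) + rng.standard_normal(n)
    print("selected:", knockoff_select(X, y, n_factors=k))
```

In the actual IPAD procedure the idiosyncratic error distribution is estimated from the data rather than assumed Gaussian (bootstrapping the PCA residuals is a natural nonparametric alternative to the draws above), and the construction can be repeated over several random draws to stabilize the selected set.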

[1] R. Durrett, Probability: Theory and Examples, 1993.

[2] C. Bonferroni, Il calcolo delle assicurazioni su gruppi di teste, 1935.

[3] E. Candès, et al. Controlling the false discovery rate via knockoffs, 2014, arXiv:1404.5609.

[4] F. Diebold, et al. Comparing Predictive Accuracy, 1994, Business Cycles.

[5] R. Vershynin, Introduction to the non-asymptotic analysis of random matrices, 2010, Compressed Sensing.

[6] C. De Mol, et al. Forecasting Using a Large Number of Predictors: Is Bayesian Regression a Valid Alternative to Principal Components?, 2006, SSRN Electronic Journal.

[7] J. M. Robins, et al. Double/De-Biased Machine Learning of Global and Local Parameters Using Regularized Riesz Representers, 2018.

[8] E. J. Candès, et al. Robust inference with knockoffs, 2018, The Annals of Statistics.

[9] R. Tibshirani, Regression Shrinkage and Selection via the Lasso, 1996.

[10] G. Li, et al. RANK: Large-Scale Inference With Graphical Nonlinear Knockoffs, 2017, Journal of the American Statistical Association.

[11] Y. Benjamini, et al. The control of the false discovery rate in multiple testing under dependency, 2001.

[12] J. S. Morris, et al. Sure independence screening for ultrahigh dimensional feature space: Discussion, 2008.

[13] J. Bai, et al. Determining the Number of Factors in Approximate Factor Models, 2000.

[14] S. C. Ahn, et al. Eigenvalue Ratio Test for the Number of Factors, 2013.

[15] F. G. Viens, et al. Some Applications of the Malliavin Calculus to Sub-Gaussian and Non-Sub-Gaussian Random Fields, 2007.

[16] J. Robins, et al. Double/Debiased Machine Learning for Treatment and Structural Parameters, 2017.

[17] Y. Zhu, et al. Inference in Approximately Sparse Correlated Random Effects Probit Models, 2017.

[18] J. Bai, Inferential Theory for Factor Models of Large Dimensions, 2003.

[19] G. Kapetanios, et al. A One-Covariate at a Time, Multiple Testing Approach to Variable Selection in High-Dimensional Linear Regression Models, 2016.

[20] J. Fan, et al. High Dimensional Classification Using Features Annealed Independence Rules, 2007, Annals of Statistics.

[21] Y. Fan, et al. Asymptotic Equivalence of Regularization Methods in Thresholded Parameter Space, 2013, arXiv:1605.03310.

[22] Y. Fan, et al. Nonuniformity of P-values Can Occur Early in Diverging Dimensions, 2017, Journal of Machine Learning Research.

[23] J. Lv, et al. Impacts of high dimensionality in finite samples, 2013, arXiv:1311.2742.

[24] J. Robins, et al. Double/de-biased machine learning using regularized Riesz representers, 2018.

[25] G. Cheng, et al. Simultaneous Inference for High-Dimensional Linear Models, 2016, arXiv:1603.01295.

[26] R. D. Shah, et al. Goodness-of-fit tests for high dimensional linear models, 2015, Journal of the Royal Statistical Society: Series B (Statistical Methodology).

[27] M. W. Watson, et al. Generalized Shrinkage Methods for Forecasting Using Many Predictors, 2012.

[28] E. Candès, et al. A knockoff filter for high-dimensional selective inference, 2016, The Annals of Statistics.

[29] J. P. Romano, et al. Exact and Approximate Stepdown Methods for Multiple Hypothesis Testing, 2003.

[30] V. Chernozhukov, et al. LASSO-Driven Inference in Time and Space, 2018, The Annals of Statistics.

[31] M. J. Wainwright, et al. A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers, 2009, NIPS.

[32] Y. Benjamini, Discovering the false discovery rate, 2010.

[33] S. Holm, A Simple Sequentially Rejective Multiple Test Procedure, 1979.

[34] J. Fan, et al. Estimating False Discovery Proportion under Arbitrary Covariance Dependence, 2012, Journal of the American Statistical Association.

[35] Y. Benjamini, et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing, 1995.

[36] G. Lynch, et al. The Control of the False Discovery Rate in Fixed Sequence Multiple Testing, 2016, arXiv:1611.03146.

[37] 佐藤 保, Principal Components, 2021, Encyclopedic Dictionary of Archaeology.

[38] R. A. Horn and C. R. Johnson, Matrix Analysis, 2nd ed., 2012.

[39] L. Breiman, Random Forests, 2001, Machine Learning.

[40] E. Candès, Y. Fan, L. Janson and J. Lv, Panning for gold: 'model-X' knockoffs for high dimensional controlled variable selection, 2016, arXiv:1610.02351.

[41] C. Hansen, et al. High-dimensional econometrics and regularized GMM, 2018, arXiv:1806.01888.

[42] F. Dias, et al. Determining the number of factors in approximate factor models with global and group-specific factors, 2008.

[43] B. Stucky, et al. Asymptotic Confidence Regions for High-Dimensional Structured Sparsity, 2017, IEEE Transactions on Signal Processing.

[44] J. Fan, et al. Sure independence screening for ultrahigh dimensional feature space, 2006, arXiv:math/0612857.

[45] J. Fan, et al. High-Dimensional Statistics, 2014.