High-dimensional variable selection with heterogeneous signals: A precise asymptotic perspective

We study the problem of exact support recovery for high-dimensional sparse linear regression when the signals are weak, rare, and possibly heterogeneous. Specifically, we fix the minimum signal magnitude at the information-theoretically optimal rate and investigate the asymptotic selection accuracy of best subset selection (BSS) and marginal screening (MS) procedures under an independent Gaussian design. Despite this ideal setup, and somewhat surprisingly, marginal screening can fail to achieve exact recovery, with probability converging to one, in the presence of heterogeneous signals, whereas BSS enjoys model consistency whenever the minimum signal strength is above the information-theoretic threshold. To mitigate the computational burden of BSS, we also propose a surrogate two-stage algorithm called ETS (Estimate Then Screen), based on iterative hard thresholding and gradient coordinate screening, and we show that ETS shares exactly the same asymptotic optimality in terms of exact recovery as BSS. Finally, we present a simulation study comparing ETS with the LASSO and marginal screening. The numerical results echo our asymptotic theory even for realistic values of the sample size, dimension, and sparsity.
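To make the two-stage construction concrete, below is a minimal sketch of an estimate-then-screen procedure in the spirit of ETS. The function names (`iht`, `ets`), the step size, the iteration count, and the exact screening score are illustrative assumptions on our part, not the paper's specification; the paper's gradient coordinate screening rule should be consulted for the precise form.

```python
import numpy as np

def iht(X, y, k, n_iter=200):
    """Stage 1: iterative hard thresholding for a k-sparse least-squares fit."""
    n, p = X.shape
    step = 1.0 / np.linalg.norm(X, 2) ** 2    # conservative 1/sigma_max(X)^2 step (an assumption)
    beta = np.zeros(p)
    for _ in range(n_iter):
        beta = beta - step * (X.T @ (X @ beta - y))    # gradient step on squared loss
        keep = np.argpartition(np.abs(beta), -k)[-k:]  # indices of k largest magnitudes
        mask = np.zeros(p, dtype=bool)
        mask[keep] = True
        beta[~mask] = 0.0                              # hard threshold the rest
    return beta

def ets(X, y, k):
    """Stage 2: screen coordinates using the gradient at the stage-1 estimate."""
    n = X.shape[0]
    beta_hat = iht(X, y, k)
    grad = X.T @ (y - X @ beta_hat)
    # One rescaled gradient step from beta_hat; when X'X/n is close to the
    # identity this roughly de-biases every coordinate, including those that
    # hard thresholding zeroed out, so the k largest scores form the selection.
    # This particular score is an illustrative choice, not the paper's rule.
    score = np.abs(beta_hat + grad / n)
    return np.sort(np.argpartition(score, -k)[-k:])

# Toy check: n = 200 samples, p = 1000 features, 5 true signals of size 1.
rng = np.random.default_rng(0)
n, p, k = 200, 1000, 5
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:k] = 1.0
y = X @ beta_true + rng.standard_normal(n)
print(ets(X, y, k))  # ideally prints [0 1 2 3 4]
```

The design mirrors the abstract: stage 1 produces a cheap k-sparse pilot estimate, and stage 2 re-examines all p coordinates through the gradient, so a weak signal dropped by hard thresholding can still be picked up at the screening step.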
