A polynomial algorithm for best-subset selection problem

Significance

Best-subset selection is a benchmark optimization problem in statistics and machine learning. Although many optimization strategies and algorithms have been proposed for this problem, our splicing algorithm, under reasonable conditions, simultaneously enjoys the following properties with high probability: 1) its computational complexity is polynomial; 2) it recovers the true subset; and 3) its solution is globally optimal.

Abstract

Best-subset selection aims to find a small subset of predictors such that the resulting linear model achieves the most desirable prediction accuracy. The problem is not only fundamental in regression analysis but also has far-reaching applications in many fields of research, including computer science and medicine. We introduce a polynomial-time algorithm that, under mild conditions, solves the problem. The algorithm exploits the idea of sequencing and splicing to reach a stable solution in finitely many steps when the sparsity level of the model is fixed but unknown. We define an information criterion that helps the algorithm select the true sparsity level with high probability. We show that when the algorithm produces a stable optimal solution, that solution is the oracle estimator of the true parameters with probability one. We also demonstrate the power of the algorithm in several numerical studies.
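To make the splicing idea concrete, below is a minimal Python sketch written under simplifying assumptions, not the authors' implementation. At a fixed sparsity level s, it repeatedly swaps the least useful active predictors for the most promising inactive ones as long as the residual sum of squares decreases, and a sweep over candidate sparsity levels picks a winner by an SIC-style information criterion of the form n·log(RSS/n) + s·log(p)·log(log n). The backward/forward "sacrifice" measures are approximated here by coefficient magnitudes and residual correlations, and all function names are illustrative.

```python
import numpy as np

def fit_on_support(X, y, active):
    """Least-squares fit restricted to the active set; returns (beta, rss)."""
    beta = np.zeros(X.shape[1])
    beta[active] = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
    resid = y - X @ beta
    return beta, float(resid @ resid)

def splicing(X, y, s, k_max=None, max_iter=20):
    """Splice: swap weak active variables for promising inactive ones
    until no swap lowers the residual sum of squares (a stable solution)."""
    n, p = X.shape
    k_max = k_max or s
    # Start from the s predictors most correlated with the response.
    active = np.argsort(-np.abs(X.T @ y))[:s]
    beta, rss = fit_on_support(X, y, active)
    for _ in range(max_iter):
        resid = y - X @ beta
        inactive = np.setdiff1d(np.arange(p), active)
        # Backward proxy: small |beta_j| -> cheap to drop from the active set.
        drop_order = active[np.argsort(np.abs(beta[active]))]
        # Forward proxy: large |X_j' r| -> promising to add from the inactive set.
        add_order = inactive[np.argsort(-np.abs(X[:, inactive].T @ resid))]
        improved = False
        for k in range(1, k_max + 1):
            candidate = np.union1d(np.setdiff1d(active, drop_order[:k]),
                                   add_order[:k])
            beta_new, rss_new = fit_on_support(X, y, candidate)
            if rss_new < rss:          # accept the first splice that helps
                active, beta, rss = candidate, beta_new, rss_new
                improved = True
                break
        if not improved:               # stable: no splice improves the fit
            break
    return active, beta, rss

def select_sparsity(X, y, s_max):
    """Sweep sparsity levels 1..s_max and keep the fit minimizing an
    SIC-style criterion: n*log(RSS/n) + s*log(p)*log(log(n))."""
    n, p = X.shape
    best = None
    for s in range(1, s_max + 1):
        active, beta, rss = splicing(X, y, s)
        sic = n * np.log(max(rss / n, 1e-12)) + s * np.log(p) * np.log(np.log(n))
        if best is None or sic < best[0]:
            best = (sic, active, beta)
    return best[1], best[2]
```

For instance, on simulated data with n = 200, p = 1,000, and a small true support, `select_sparsity(X, y, s_max=20)` returns the estimated support and coefficients; the inner splicing loop terminates in finitely many steps because each accepted splice strictly lowers the residual sum of squares.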
