BOOTSTRAP-BASED PENALTY CHOICE FOR THE LASSO, ACHIEVING ORACLE PERFORMANCE

In theory, if penalty parameters are chosen appropriately then the lasso can eliminate unnecessary variables in prediction problems, and improve the performance of predictors based on the variables that remain. However, standard methods for tuning-parameter choice, for example techniques based on the bootstrap or cross-validation, are not sufficiently accurate to achieve this level of precision. Until Zou's (2006) proposal for an inversely-weighted lasso, this difficulty led to speculation that it might not be possible to achieve oracle performance using the lasso. In the present paper we show that a straightforward application of the m-out-of-n bootstrap produces adaptive penalty estimates that confer oracle properties on the lasso. The application is of interest in its own right since, unlike many uses of the m-out-of-n bootstrap, it is not designed to estimate a non-normal distribution; the limiting distributions of regression parameter estimators are normal. Instead, the m-out-of-n bootstrap overcomes the tendency of the standard bootstrap to confound the errors committed in determining whether or not a parameter value is zero with the estimation errors for nonzero parameters.
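To make the idea concrete, the Python sketch below illustrates one way an m-out-of-n bootstrap could be used to choose the lasso penalty: resamples of size m < n are drawn with replacement, the penalty minimising average out-of-resample prediction error is selected, and the lasso is then refitted on the full data. This is an illustrative sketch only, not the paper's exact adaptive-penalty construction; the resample size m = n^(3/4), the penalty grid, the squared-error criterion, and the use of scikit-learn's Lasso are all assumptions made for the example.

# Minimal illustrative sketch of m-out-of-n bootstrap penalty choice for the lasso.
# NOT the authors' exact procedure; m, the penalty grid and the error criterion
# are placeholder choices.
import numpy as np
from sklearn.linear_model import Lasso

def choose_penalty_m_out_of_n(X, y, m, penalties, n_boot=100, seed=None):
    """Pick the lasso penalty minimising average out-of-resample squared error
    over n_boot bootstrap resamples of size m (with m < n)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    errors = np.zeros(len(penalties))
    for _ in range(n_boot):
        idx = rng.choice(n, size=m, replace=True)   # m-out-of-n resample
        out = np.setdiff1d(np.arange(n), idx)       # held-out observations
        for j, lam in enumerate(penalties):
            fit = Lasso(alpha=lam).fit(X[idx], y[idx])
            errors[j] += np.mean((y[out] - fit.predict(X[out])) ** 2)
    return penalties[np.argmin(errors)]

# Usage on simulated sparse-regression data:
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta = np.array([2.0, 0.0, 0.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta + rng.standard_normal(n)

lam_hat = choose_penalty_m_out_of_n(X, y, m=int(n ** 0.75),
                                    penalties=np.logspace(-3, 0, 20))
final_fit = Lasso(alpha=lam_hat).fit(X, y)   # refit on the full sample
print("chosen penalty:", lam_hat)
print("estimated coefficients:", final_fit.coef_)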

[1] Wenjiang J. Fu, et al. Asymptotics for lasso-type estimators, 2000.

[2] Joel A. Tropp, et al. Recovery of short, complex linear combinations via ℓ1 minimization, 2005, IEEE Transactions on Information Theory.

[3] Stephen J. Wright, et al. Numerical Optimization, 2018, Fundamental Statistical Inference.

[4] H. Zou. The Adaptive Lasso and Its Oracle Properties, 2006.

[5] L. Breiman. Better subset regression using the nonnegative garrote, 1995.

[6] V. V. Petrov. Sums of Independent Random Variables, 1975.

[7] Michael C. Ferris, et al. Model building with likelihood basis pursuit, 2004, Optim. Methods Softw.

[8] Michael A. Saunders, et al. Atomic Decomposition by Basis Pursuit, 1998, SIAM J. Sci. Comput.

[9] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[10] D. Donoho. For most large underdetermined systems of equations, the minimal ℓ1-norm near-solution approximates the sparsest near-solution, 2006.

[11] Xiaoming Huo, et al. Uncertainty principles and ideal atomic decomposition, 2001, IEEE Trans. Inf. Theory.

[12] Hong-Ye Gao, et al. Wavelet Shrinkage Denoising Using the Non-Negative Garrote, 1998.

[13] Michael Elad, et al. Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization, 2003, Proceedings of the National Academy of Sciences of the United States of America.

[14] N. Meinshausen, et al. High-dimensional graphs and variable selection with the Lasso, 2006, math/0608017.

[15] Jianqing Fan, et al. Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties, 2001.

[16] D. Donoho. For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution, 2006.