Improved variable selection with Forward-Lasso adaptive shrinkage

Recently, considerable interest has focused on variable selection methods for regression problems in which the number of predictors, $p$, is large relative to the number of observations, $n$. Two commonly applied variable selection approaches are the Lasso, which computes highly shrunk regression coefficients, and Forward Selection, which applies no shrinkage at all. We propose a new approach, "Forward-Lasso Adaptive SHrinkage" (FLASH), which includes the Lasso and Forward Selection as special cases and applies to both linear regression and generalized linear models (GLMs). As with the Lasso and Forward Selection, FLASH iteratively adds one variable to the model in a hierarchical fashion but, unlike those methods, adjusts the level of shrinkage at each step so as to optimize the selection of the next variable. We first present FLASH in the linear regression setting and show that it can be fitted using a variant of the computationally efficient LARS algorithm. We then extend FLASH to the GLM setting and demonstrate, through numerous simulations and real-world data sets, as well as some theoretical analysis, that FLASH generally outperforms many competing approaches.
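To make the adaptive-shrinkage idea concrete, here is a minimal Python sketch: at each forward step it interpolates between the current fit and the full least-squares fit on the active set, choosing the interpolation (shrinkage) level on a held-out validation set. The function name `flash_sketch`, the candidate grid `deltas`, and the validation-set tuning are illustrative assumptions, not the paper's actual LARS-based procedure.

```python
import numpy as np

def flash_sketch(X, y, X_val, y_val, n_steps=10, deltas=(0.1, 0.5, 1.0)):
    """Toy forward selection with per-step adaptive shrinkage.

    At each step the variable most correlated with the current residual
    enters the model, and the coefficient vector moves toward the full
    least-squares fit on the active set by a fraction `delta` chosen on
    a validation set.  delta = 1 at every step recovers Forward
    Selection; small fixed deltas give heavily shrunk, Lasso-like fits.
    Illustrative sketch only, not the authors' LARS-based FLASH update.
    """
    p = X.shape[1]
    beta = np.zeros(p)
    active = []
    for _ in range(min(n_steps, p)):
        resid = y - X @ beta
        corr = X.T @ resid
        corr[active] = 0.0                        # consider inactive variables only
        active.append(int(np.argmax(np.abs(corr))))
        ls = np.zeros(p)                          # least-squares fit on the active set
        ls[active] = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
        best_err, best_beta = np.inf, beta
        for d in deltas:                          # adapt the shrinkage for this step
            cand = (1 - d) * beta + d * ls
            err = np.mean((y_val - X_val @ cand) ** 2)
            if err < best_err:
                best_err, best_beta = err, cand
        beta = best_beta
    return beta, active
```

Fixing delta at 1 reduces the sketch to classical Forward Selection, while a small fixed delta behaves like a heavily shrunk stagewise fit; letting the data pick delta separately at each step is the adaptive element the abstract describes. The actual algorithm makes this choice within the LARS path geometry rather than by interpolating toward the least-squares fit.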
