Forward-LASSO with Adaptive Shrinkage

Recently, considerable interest has focused on variable selection methods in regression settings where the number of predictors, p, is large relative to the number of observations, n. Two commonly applied variable selection approaches are the Lasso, which computes highly shrunk regression coefficients, and Forward Selection, which applies no shrinkage. We propose a new approach, “Forward-Lasso Adaptive SHrinkage” (FLASH), which includes the Lasso and Forward Selection as special cases and can be used in both the linear regression and Generalized Linear Model (GLM) settings. As with the Lasso and Forward Selection, FLASH iteratively adds one variable to the model in a hierarchical fashion but, unlike these methods, adjusts the level of shrinkage at each step so as to optimize the selection of the next variable. We first present FLASH in the linear regression setting and show that it can be fitted using a variant of the computationally efficient LARS algorithm. We then extend FLASH to the GLM setting and demonstrate, through numerous simulations and real-world data sets, as well as some theoretical analysis, that FLASH generally outperforms many competing approaches.

Key words: Forward Selection; Lasso; Shrinkage; Variable Selection
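
The abstract describes the mechanism only in outline, so the following is a minimal sketch of one plausible reading of it, using only NumPy. The function name flash_sketch, the correlation-based entry rule, and the fixed per-step shrinkage schedule gammas are illustrative assumptions, not the authors' algorithm; the paper's actual procedure is built on a variant of LARS and tunes the shrinkage at each step rather than fixing it in advance. Setting every gamma to 1 makes each step jump to the active-set least-squares fit, as in Forward Selection, while small values give heavily shrunk, Lasso-like coefficient paths.

    import numpy as np

    def flash_sketch(X, y, gammas):
        """Illustrative forward path with per-step adaptive shrinkage.

        At each step, the predictor most correlated with the current
        residual enters the active set, and the active coefficients are
        moved a fraction gamma of the way toward their least-squares
        values. gamma = 1 at every step mimics Forward Selection; small
        gammas mimic heavy, Lasso-like shrinkage.
        """
        n, p = X.shape
        beta = np.zeros(p)
        active = []
        resid = y - X @ beta
        for gamma in gammas:
            # Entry rule: pick the inactive predictor most correlated
            # with the residual.
            corr = X.T @ resid
            if active:
                corr[active] = 0.0
            j = int(np.argmax(np.abs(corr)))
            active.append(j)
            # Least-squares fit restricted to the active set.
            beta_ls, *_ = np.linalg.lstsq(X[:, active], y, rcond=None)
            # Shrunk update: move only a fraction gamma of the full step.
            beta[active] += gamma * (beta_ls - beta[active])
            resid = y - X @ beta
        return beta, active

    # Usage on synthetic data with three true signals:
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 20))
    beta_true = np.zeros(20)
    beta_true[:3] = [3.0, -2.0, 1.5]
    y = X @ beta_true + rng.standard_normal(100)
    beta_hat, selected = flash_sketch(X, y, gammas=[0.8, 0.8, 0.8])

In practice the per-step shrinkage levels would be chosen adaptively, for example by validation error over a grid, rather than fixed in advance as in this sketch.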
