Consistent group selection in high-dimensional linear regression.

In regression problems where covariates can be naturally grouped, the group Lasso is an attractive method for variable selection because it respects the grouping structure in the data. We study the selection and estimation properties of the group Lasso in high-dimensional settings where the number of groups exceeds the sample size. We provide sufficient conditions under which, with high probability, the group Lasso selects a model whose dimension is comparable with that of the underlying true model, and is estimation consistent. However, the group Lasso is, in general, not selection consistent and tends to select groups that are unimportant in the model. To improve selection, we propose an adaptive group Lasso method, a generalization of the adaptive Lasso that requires an initial estimator. We show that, under certain conditions, the adaptive group Lasso is consistent in group selection when the group Lasso is used as the initial estimator.
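The two-stage procedure described above can be sketched numerically: fit a group Lasso by proximal gradient descent, then refit with groupwise penalty weights inversely proportional to the initial group norms, so that groups zeroed at the first stage are excluded at the second. This is a minimal illustrative sketch, not the paper's implementation; the solver, step size, and tuning parameters below are all assumptions.

```python
import numpy as np

def group_soft_threshold(v, t):
    # Block soft-thresholding: shrink the whole group toward zero,
    # setting it exactly to zero when its Euclidean norm is at most t.
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v

def group_lasso(X, y, groups, lam, weights=None, n_iter=500):
    # Proximal gradient (ISTA) for
    #   0.5 * ||y - X b||^2 + lam * sum_g w_g ||b_g||_2 .
    p = X.shape[1]
    if weights is None:
        weights = np.ones(len(groups))
    beta = np.zeros(p)
    # Step size 1/L, where L = ||X||_2^2 is the Lipschitz constant
    # of the gradient of the least-squares loss.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        z = beta - step * (X.T @ (X @ beta - y))
        for g, idx in enumerate(groups):
            beta[idx] = group_soft_threshold(z[idx], step * lam * weights[g])
    return beta

def adaptive_group_lasso(X, y, groups, lam1, lam2):
    # Stage 1: ordinary group Lasso as the initial estimator.
    beta_init = group_lasso(X, y, groups, lam1)
    norms = np.array([np.linalg.norm(beta_init[idx]) for idx in groups])
    # Stage 2: groupwise weights inversely proportional to the initial
    # group norms; groups zeroed at stage 1 get infinite weight and are
    # therefore dropped from the model.
    with np.errstate(divide="ignore"):
        weights = np.where(norms > 0, 1.0 / np.maximum(norms, 1e-300), np.inf)
    return group_lasso(X, y, groups, lam2, weights=weights)
```

In this sketch the adaptive weights `1/||b_g||` play the same role as the coordinatewise weights of the adaptive Lasso: a large initial group norm yields a small penalty on that group, while a small or zero initial norm penalizes the group heavily, which is what drives the improved selection behavior.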
