A group bridge approach for variable selection

Abstract. In multiple regression problems where covariates can be naturally grouped, it is important to carry out feature selection at the group level and at the within-group individual variable level simultaneously. Existing methods, including the lasso and the group lasso, are designed for either variable selection or group selection, but not for both. We propose a group bridge approach that is capable of simultaneous selection at both the group and within-group individual variable levels. The proposed approach is a regularization method built on a specially designed group bridge penalty. It has the oracle group selection property, in that it correctly selects the important groups with probability converging to one. In contrast, the group lasso and group least angle regression methods do not, in general, possess this oracle property in group selection. Simulation studies indicate that the group bridge performs better in group and individual variable selection than several existing methods.
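The abstract does not state the penalty itself. As a minimal sketch, assume the form usually associated with the group bridge: the L1 norm of each group's coefficients raised to a power gamma in (0, 1), weighted and summed over groups. The function name `group_bridge_penalty`, the argument names, and the default values below are illustrative assumptions, not the authors' code.

```python
def group_bridge_penalty(beta, groups, lam=1.0, gamma=0.5, weights=None):
    """Evaluate lam * sum_j c_j * (sum_{k in A_j} |beta_k|) ** gamma.

    beta    : sequence of coefficients
    groups  : list of index lists A_j (groups may overlap)
    lam     : overall penalty level (assumed default)
    gamma   : bridge exponent, must lie strictly between 0 and 1
    weights : per-group constants c_j (assumed to default to 1)
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(groups)
    total = 0.0
    for c, idx in zip(weights, groups):
        # L1 norm of the coefficients that belong to this group
        l1_norm = sum(abs(beta[i]) for i in idx)
        # The concave power gamma < 1 acts on the whole group at once
        total += c * l1_norm ** gamma
    return lam * total


groups = [[0, 1], [2, 3]]
concentrated = group_bridge_penalty([4.0, 0.0, 0.0, 0.0], groups)  # one active group
spread = group_bridge_penalty([1.0, 1.0, 1.0, 1.0], groups)        # two active groups
print(concentrated, spread)
```

Both coefficient vectors have the same overall L1 norm, yet the vector that concentrates its mass in a single group receives the smaller penalty. Under the stated assumption, this is the mechanism that favors dropping whole groups, while the L1 norm inside each group still permits individual coefficients within a retained group to be set to zero.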
