Theoretical Properties of the Overlapping Groups Lasso

We present two sets of theoretical results on the group lasso with overlap of Jacob, Obozinski and Vert (2009) in the linear regression setting. This method performs joint selection of predictors in sparse regression, encoding complex structured sparsity over the predictors as a set of groups. The flexibility of this framework suggests that arbitrarily complex structures can be encoded with an intricate set of groups, but our results show that such a strategy has unexpected theoretical consequences for the procedure. In particular, we give two sets of results: (1) finite sample bounds on prediction and estimation error, and (2) asymptotic distribution and selection results. Both sets of results give insight into the consequences of choosing an increasingly complex set of groups, and into what happens when the set of groups cannot recover the true sparsity pattern. These results also clarify the differences and similarities between the group lasso procedures with and without overlapping groups. Our analysis shows that the set of groups must be chosen with caution: an overly complex set of groups will damage the analysis.
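To fix ideas, let X ∈ R^{n×p} denote the design matrix, y ∈ R^n the response, and G a collection of groups g ⊆ {1, ..., p} that are allowed to overlap (the notation here is ours, chosen for this summary). The penalty of Jacob, Obozinski and Vert (2009) decomposes the coefficient vector into latent components, one supported on each group:

\[
\Omega_{\mathcal{G}}(\beta) = \inf\Big\{ \sum_{g \in \mathcal{G}} \|v^g\|_2 \;:\; v^g \in \mathbb{R}^p,\ \operatorname{supp}(v^g) \subseteq g,\ \sum_{g \in \mathcal{G}} v^g = \beta \Big\},
\]

and the estimator under study is the penalized least squares solution

\[
\hat{\beta} \in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p} \; \frac{1}{2n} \|y - X\beta\|_2^2 + \lambda\, \Omega_{\mathcal{G}}(\beta),
\]

whose support is, by construction, a union of groups in G. Computationally, the overlap penalty reduces to an ordinary non-overlapping group lasso after duplicating each predictor once per group that contains it, an equivalence noted in Jacob, Obozinski and Vert (2009). The following Python sketch illustrates that reduction; it assumes NumPy, and the function names are ours, purely illustrative:

    import numpy as np

    def expand_design(X, groups):
        # Copy each column of X once per group containing it, so that the
        # groups of the expanded design are disjoint; an ordinary group
        # lasso on the expanded design is then equivalent to the
        # overlapping group lasso on X.
        blocks, new_groups, start = [], [], 0
        for g in groups:
            blocks.append(X[:, list(g)])
            new_groups.append(list(range(start, start + len(g))))
            start += len(g)
        return np.hstack(blocks), new_groups

    def collapse_coefficients(v, groups, p):
        # Map the latent (expanded) coefficients back to the original
        # predictors: beta_j is the sum of all copies of coordinate j.
        beta = np.zeros(p)
        start = 0
        for g in groups:
            beta[np.asarray(list(g))] += v[start:start + len(g)]
            start += len(g)
        return beta

Any solver for the standard non-overlapping group lasso can then be run on the expanded design with the disjoint groups, and its fitted coefficients mapped back to the original predictors through collapse_coefficients.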

References

[1] A. W. van der Vaart and J. A. Wellner. Weak Convergence and Empirical Processes: With Applications to Statistics. Springer, 1996.

[2] R. Tibshirani. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society, Series B, 1996.

[3] K. Knight and W. J. Fu. Asymptotics for lasso-type estimators. Annals of Statistics, 2000.

[4] H. Zou. The Adaptive Lasso and Its Oracle Properties. Journal of the American Statistical Association, 2006.

[5] M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society, Series B, 2006.

[6] Y. Nardi and A. Rinaldo. On the asymptotic properties of the group lasso estimator for linear models. Electronic Journal of Statistics, 2008.

[7] F. R. Bach. Consistency of the group Lasso and multiple kernel learning. Journal of Machine Learning Research, 2008.

[8] C. Chesneau and M. Hebiri. Some theoretical results on the Grouped Variables Lasso. Mathematical Methods of Statistics, 2008.

[9] F. R. Bach. Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning. In Advances in Neural Information Processing Systems (NIPS), 2008.

[10] L. Jacob, G. Obozinski, and J.-P. Vert. Group lasso with overlap and graph lasso. In Proceedings of the 26th International Conference on Machine Learning (ICML), 2009.

[11] P. Zhao, G. Rocha, and B. Yu. The composite absolute penalties family for grouped and hierarchical variable selection. Annals of Statistics, 2009.

[12] K. Lounici, M. Pontil, A. B. Tsybakov, and S. van de Geer. Taking Advantage of Sparsity in Multi-Task Learning. In Proceedings of the Conference on Learning Theory (COLT), 2009.

[13] P. J. Bickel, Y. Ritov, and A. B. Tsybakov. Simultaneous analysis of Lasso and Dantzig selector. Annals of Statistics, 2009.

[14] J. Huang, T. Zhang, and D. Metaxas. Learning with structured sparsity. In Proceedings of the 26th International Conference on Machine Learning (ICML), 2009.

[15] J. Huang and T. Zhang. The Benefit of Group Sparsity. Annals of Statistics, 2010.

[16] F. R. Bach. Structured sparsity-inducing norms through submodular functions. In Advances in Neural Information Processing Systems (NIPS), 2010.

[17] J. Peng, J. Zhu, et al. Regularized Multivariate Regression for Identifying Master Predictors with Application to Integrative Genomics Study of Breast Cancer. Annals of Applied Statistics, 2010.

[18] S. Kim and E. P. Xing. Tree-Guided Group Lasso for Multi-Task Regression with Structured Sparsity. In Proceedings of the 27th International Conference on Machine Learning (ICML), 2010.

[19] R. Jenatton, G. Obozinski, and F. Bach. Structured Sparse Principal Component Analysis. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2010.

[20] R. Jenatton, J.-Y. Audibert, and F. Bach. Structured Variable Selection with Sparsity-Inducing Norms. Journal of Machine Learning Research, 2011.

[21] F. Bach. Shaping Level Sets with Submodular Functions. In Advances in Neural Information Processing Systems (NIPS), 2011.

[22] D. Percival, K. Roeder, R. Rosenfeld, and L. Wasserman. Structured, sparse regression with application to HIV drug resistance. Annals of Applied Statistics, 2011.