Exclusive Group Lasso for Structured Variable Selection

A structured variable selection problem is considered in which the covariates, divided into predefined groups, activate according to sparse patterns with few nonzero entries per group. Capitalizing on the concept of atomic norm, a composite norm is designed to promote such exclusive group sparsity patterns. The resulting norm lends itself to efficient and flexible regularized optimization algorithms for support recovery, such as proximal algorithms. Moreover, an active set algorithm is proposed that builds the solution by successively including structure atoms in the estimated support. It is also shown that this algorithm can be tailored to match structures more rigid than plain exclusive group sparsity. An asymptotic consistency analysis, in which both the number of parameters and the number of groups grow with the observation size, establishes the effectiveness of the proposed solution in terms of signed support recovery under conventional assumptions. Finally, numerical simulations corroborate the results.
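To make the proximal approach concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not state the composite norm explicitly, so the sketch assumes the squared exclusive-lasso penalty common in the literature, (1/2) Σ_g ||w_g||_1², over non-overlapping groups, combined with a least-squares loss. The function names (`exclusive_penalty`, `prox_group`, `prox_exclusive`, `proximal_gradient`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def exclusive_penalty(w, groups):
    """Assumed penalty: 0.5 * sum_g ||w_g||_1^2, with `groups` a list of index arrays."""
    return 0.5 * sum(np.abs(w[g]).sum() ** 2 for g in groups)

def prox_group(v, lam):
    """argmin_x 0.5*||x - v||^2 + 0.5*lam*||x||_1^2 for a single group.

    The minimizer is a soft-thresholding of v at tau = lam * ||x||_1. With the
    k largest magnitudes active, ||x||_1 = S_k / (1 + lam*k), where S_k is the
    sum of those k magnitudes; k is the largest index whose magnitude exceeds
    the corresponding threshold.
    """
    u = np.sort(np.abs(v))[::-1]              # magnitudes, descending
    csum = np.cumsum(u)                       # S_k for k = 1..len(v)
    k = np.arange(1, u.size + 1)
    tau_k = lam * csum / (1.0 + lam * k)      # candidate thresholds
    active = np.nonzero(u > tau_k)[0]
    if active.size == 0:                      # only happens when v is all zeros
        return np.zeros_like(v, dtype=float)
    tau = tau_k[active[-1]]
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_exclusive(v, groups, lam):
    """Apply the group-wise prox; the penalty separates across disjoint groups."""
    out = np.zeros_like(v, dtype=float)
    for g in groups:
        out[g] = prox_group(v[g], lam)
    return out

def proximal_gradient(X, y, groups, lam, n_iter=500):
    """Minimize (1/2n)*||y - Xw||^2 + lam * exclusive_penalty(w, groups)."""
    n = X.shape[0]
    step = n / np.linalg.norm(X, 2) ** 2      # 1/L, L = Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = prox_exclusive(w - step * grad, groups, step * lam)
    return w
```

Because the threshold in `prox_group` scales with the group's own l1 norm, large entries suppress smaller ones within the same group, while a group whose input is not identically zero always keeps at least one nonzero entry. This is the within-group competition that exclusive sparsity relies on.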
