A fast algorithm for group square-root Lasso based group-sparse regression

Abstract: Group square-root Lasso (GSRL) is a promising tool for group-sparse regression because its hyperparameter can be chosen independently of the noise level. Recent work has also revealed its connections to statistically sound, hyperparameter-free methods such as group-sparse iterative covariance-based estimation (GSPICE). However, the non-smooth data-fitting term makes the GSRL optimization problem difficult to solve, and available solvers usually suffer from either slow convergence or restrictions on the dictionary. In this paper, we propose a class of efficient block coordinate descent solvers for GSRL: group-wise cyclic minimization (GCM) for group-wise orthonormal dictionaries, and generalized GCM (G-GCM) for general dictionaries. Both the strict descent property and global convergence are proved. To accommodate signal processing applications, the complex-valued multiple measurement vectors (MMV) case is also considered. The proposed algorithms can further be used as fast implementations of methods that are theoretically equivalent to GSRL, such as GSPICE. Simulation results show that the proposed solvers are substantially more computationally efficient than existing ones.
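To make the block coordinate descent idea concrete, the following is a minimal sketch and not the paper's GCM or G-GCM. It assumes the unnormalized cost ||y - Ax||_2 + lam * sum_g ||x_g||_2 (the paper's scaling and group weights may differ), a single measurement vector rather than the MMV case, and a dictionary whose groups are each orthonormal. Under those assumptions each group subproblem has a closed-form minimizer, derived here for illustration. The names `gsrl_objective` and `cyclic_group_minimization` are hypothetical.

```python
import numpy as np


def gsrl_objective(y, A, x, groups, lam):
    """Assumed GSRL cost: ||y - A x||_2 + lam * sum_g ||x_g||_2."""
    penalty = sum(np.linalg.norm(x[g]) for g in groups)
    return np.linalg.norm(y - A @ x) + lam * penalty


def cyclic_group_minimization(y, A, groups, lam, n_sweeps=50):
    """Exact cyclic block minimization of the assumed cost.

    Assumes A[:, g]^H A[:, g] = I for every index set g in `groups`.
    """
    x = np.zeros(A.shape[1], dtype=np.result_type(A, y))
    for _ in range(n_sweeps):
        for g in groups:
            Ag = A[:, g]
            # Residual with the current contribution of group g added back.
            r = y - A @ x + Ag @ x[g]
            # Correlation of that residual with the group's atoms.
            z = Ag.conj().T @ r
            zeta = np.linalg.norm(z)
            # Norm of the part of r orthogonal to span(Ag);
            # clip small negative values caused by round-off.
            c = np.sqrt(max(np.linalg.norm(r) ** 2 - zeta ** 2, 0.0))
            if lam >= 1.0 or zeta == 0.0:
                # The penalty dominates (or there is no correlation):
                # the block minimizer is the zero vector.
                x[g] = 0.0
                continue
            # Shrink z along its own direction; the threshold scales with c,
            # so it adapts to the current residual level.
            t = max(zeta - lam * c / np.sqrt(1.0 - lam ** 2), 0.0)
            x[g] = (t / zeta) * z
    return x
```

Because every group update is an exact minimizer of the assumed cost over that block, the cost cannot increase from one sweep to the next. In this sketch, `groups` is a list of integer index arrays, for example `[np.arange(0, 3), np.arange(3, 6)]`.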
