On first-order algorithms for ℓ1/nuclear norm minimization

Over the past decade, problems involving ℓ1/nuclear norm minimization have attracted considerable attention in the signal processing, machine learning, and optimization communities. In this paper, which treats ℓ1/nuclear norm minimization problems as 'optimization beasts', we give a detailed description of two attractive first-order optimization techniques for solving problems of this type. The first, aimed primarily at lasso-type problems, comprises fast gradient methods applied to composite minimization formulations. The second, aimed at Dantzig-selector-type problems, utilizes saddle-point first-order algorithms together with a reformulation of the problem of interest as a generalized bilinear saddle-point problem. For both approaches, we give complete and detailed complexity analyses and discuss their application domains.
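As a concrete illustration of the first approach (fast gradient methods for composite minimization), below is a minimal FISTA-style sketch for the lasso, min_x 0.5‖Ax − b‖² + λ‖x‖₁. This is a generic accelerated proximal-gradient sketch, not the paper's specific algorithm; all problem data and parameter values here are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t*||.||_1: componentwise shrinkage toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_lasso(A, b, lam, n_iter=500):
    """Accelerated proximal gradient (FISTA) for
    min_x 0.5*||A x - b||_2^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient of the smooth part
    x = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)                        # gradient step at the extrapolated point
        x_new = soft_threshold(y - grad / L, lam / L)   # proximal (shrinkage) step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)   # Nesterov momentum extrapolation
        x, t = x_new, t_new
    return x

# Illustrative use: recover a sparse vector from noiseless random measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[[3, 17, 60]] = [2.0, -1.5, 1.0]
b = A @ x_true
x_hat = fista_lasso(A, b, lam=0.05, n_iter=2000)
```

The momentum extrapolation is what distinguishes the fast gradient method from plain proximal gradient (ISTA): it improves the objective convergence rate from O(1/k) to O(1/k²) in the iteration count k, which is the rate discussed in the complexity analyses.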
