Algorithms for sparse and low-rank optimization: convergence, complexity and applications

Solving optimization problems with sparse or low-rank optimal solutions has been an important topic since the recent emergence of compressed sensing and its matrix extensions, such as matrix rank minimization and robust principal component analysis. Compressed sensing enables one to recover a signal or image from fewer observations than the “length” of the signal or image, and thus offers potential breakthroughs in applications where data acquisition is costly. However, this potential cannot be realized without efficient optimization algorithms that can handle the extremely large-scale, dense data that arise in real applications. Although the convex relaxations of these problems can be reformulated as linear programming, second-order cone programming or semidefinite programming problems, the standard methods for solving these relaxations are not applicable because the problems are usually of huge size and contain dense data. In this dissertation, we give efficient algorithms for solving these “sparse” optimization problems and analyze their convergence and iteration complexity properties.
Chapter 2 presents algorithms for solving the linearly constrained matrix rank minimization problem. The tightest convex relaxation of this problem is linearly constrained nuclear norm minimization. Although the latter can be cast and solved as a semidefinite programming problem, such an approach is computationally expensive when the matrices are large. We propose fixed-point and Bregman iterative algorithms for solving the nuclear norm minimization problem and prove convergence of the first of these algorithms. By using a homotopy approach together with an approximate singular value decomposition procedure, we obtain a very fast, robust and powerful algorithm, which we call FPCA (Fixed Point Continuation with Approximate SVD), that can solve very large matrix rank minimization problems.
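The fixed-point idea with continuation can be illustrated by a minimal sketch for matrix completion. It is not the dissertation's implementation. It assumes the regularized objective mu*||X||_* + 0.5*||P(X) - P(M)||_F^2, where P keeps only the sampled entries, and it uses a full SVD from NumPy in place of the approximate SVD that gives FPCA its speed. The function names, the step size `tau`, the continuation factor `eta` and the iteration counts are illustrative choices, not values taken from the text.

```python
import numpy as np

def svd_shrink(Y, nu):
    """Soft-threshold the singular values of Y by nu (matrix shrinkage)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U * np.maximum(s - nu, 0.0)) @ Vt

def fixed_point_continuation(M, mask, mu_final=1e-4, eta=0.25, tau=1.0,
                             inner_iters=50):
    """Sketch of fixed-point iterations with continuation on mu.

    M    : matrix whose entries are trusted only where mask is True
    mask : boolean array marking the sampled entries
    All parameter defaults are hypothetical, chosen for illustration.
    """
    X = np.zeros(M.shape, dtype=float)
    # Start from a large mu (spectral norm of the sampled data) and
    # decrease it geometrically: the homotopy / continuation strategy.
    mu = np.linalg.norm(mask * M, 2)
    while True:
        mu = max(eta * mu, mu_final)
        for _ in range(inner_iters):
            grad = mask * (X - M)                  # gradient of the data-fit term
            X = svd_shrink(X - tau * grad, tau * mu)  # shrinkage step
        if mu <= mu_final:
            break
    return X
```

Calling `fixed_point_continuation(M, mask)` on a small low-rank matrix with a random boolean mask returns a completed matrix of the same shape. Each inner iteration costs one SVD, which is why replacing the full SVD with an approximate one matters at scale.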
Our numerical results on randomly generated and real matrix completion problems demonstrate that FPCA is much faster and provides much better recoverability than semidefinite programming solvers such as SDPT3. For example, it can recover 1000 × 1000 matrices of rank 50 with a relative error of 10⁻⁵ in about 3 minutes by sampling only 20 percent of the elements. We know of no other method that achieves as good recoverability. Numerical experiments on online recommendation, DNA microarray data and image inpainting problems demonstrate the effectiveness of our algorithms.
In Chapter 3, we study the convergence and recoverability properties of the fixed-point continuation algorithm and its variants for matrix rank minimization. We also propose heuristics for determining the rank of the matrix when its true rank is not known. Some of these algorithms are closely related to greedy algorithms in compressed sensing. We report numerical results for these algorithms on linearly constrained matrix rank minimization problems.
Chapters 4 and 5 consider alternating direction type methods for solving composite convex optimization problems. In Chapter 4, we present alternating linearization algorithms that are based on an alternating direction augmented Lagrangian approach for minimizing the sum of two convex functions. Our basic methods require at most O(1/ε) iterations to obtain an ε-optimal solution, while our accelerated (i.e., fast) versions require at most O(1/√ε) iterations, with little change in the computational effort required at each iteration. For the more general problem of minimizing the sum of K convex functions, we propose multiple-splitting algorithms. We propose both basic and accelerated algorithms, with O(1/ε) and O(1/√ε) iteration complexity bounds, respectively, for obtaining an ε-optimal solution.
To the best of our knowledge, the complexity results presented in these two chapters are the first of this type to be given for splitting and alternating direction type methods. Numerical results on various applications in sparse and low-rank optimization, including compressed sensing, matrix completion, image deblurring and robust principal component analysis, are reported to demonstrate the efficiency of our methods.
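To make the gap between the basic and accelerated iteration bounds concrete, the following worked comparison takes a target accuracy of 10⁻⁴. The constants C_1 and C_2 are hypothetical placeholders for the problem-dependent factors hidden in the O(·) notation, and the choice of accuracy is only for illustration.

```latex
\begin{align*}
N_{\text{basic}}(\epsilon) &\le \frac{C_1}{\epsilon}, &
N_{\text{fast}}(\epsilon) &\le \frac{C_2}{\sqrt{\epsilon}}, \\
\epsilon = 10^{-4}: \qquad \frac{1}{\epsilon} &= 10^{4}, &
\frac{1}{\sqrt{\epsilon}} &= 10^{2}.
\end{align*}
```

Up to these constants, tightening the accuracy by a factor of 100 multiplies the basic bound by 100 but the accelerated bound by only 10.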
