A Generalized Matrix Splitting Algorithm

Composite function minimization captures a wide spectrum of applications in computer vision and machine learning. It includes bound constrained optimization, $\ell_1$ norm regularized optimization, and $\ell_0$ norm regularized optimization as special cases. This paper proposes and analyzes a new Generalized Matrix Splitting Algorithm (GMSA) for minimizing composite functions. It can be viewed as a generalization of the classical Gauss-Seidel method and the Successive Over-Relaxation method for solving linear systems. Our algorithm is derived from a novel triangle operator mapping, which can be computed exactly using a new generalized Gaussian elimination procedure. We establish the global convergence, convergence rate, and iteration complexity of GMSA for convex problems, and we discuss several important extensions. Finally, we validate the proposed method on three applications: nonnegative matrix factorization, $\ell_0$ norm regularized sparse coding, and the $\ell_1$ norm regularized Dantzig selector problem. Extensive experiments show that our method achieves state-of-the-art performance in terms of both efficiency and efficacy.
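
To make the splitting idea concrete, the sketch below applies one GMSA-style update to the quadratic composite problem $\min_x \tfrac{1}{2}x^\top A x - b^\top x + \lambda\|x\|_1$. This is a minimal illustration under stated assumptions, not the paper's exact method: the choice of splitting $A = B + C$ with lower-triangular $B = L + D/\omega + \epsilon I$, the parameter names `omega` and `eps`, and the helper `soft_threshold` are all illustrative. The generalized Gaussian elimination procedure corresponds to the forward substitution loop, in which each pivot step applies a scalar proximal operator of the separable nonsmooth term.

```python
import numpy as np

def soft_threshold(v, tau):
    """Scalar proximal operator of tau*|u| (for the ell_1 term)."""
    return np.sign(v) * max(abs(v) - tau, 0.0)

def gmsa_triangle_step(A, b, x, lam=0.1, omega=1.0, eps=0.01):
    """One illustrative GMSA-style update for
    min_x 0.5*x'Ax - b'x + lam*||x||_1.

    Assumed splitting (hypothetical parameter names): A = B + C with
    B = tril(A, -1) + diag(A)/omega + eps*I lower triangular. The
    triangle subproblem B*u + lam*sign(u) = b - C*x is then solved by
    forward substitution, applying the scalar prox at each pivot.
    """
    n = len(b)
    L = np.tril(A, -1)                    # strictly lower part of A
    Bdiag = np.diag(A) / omega + eps      # relaxed diagonal of B
    C = A - L - np.diag(Bdiag)            # remainder, so that A = B + C
    rhs = b - C @ x                       # right-hand side of the subproblem
    u = np.zeros(n)
    for i in range(n):                    # forward substitution with prox pivots
        t = rhs[i] - L[i, :i] @ u[:i]     # eliminate already-updated coordinates
        u[i] = soft_threshold(t / Bdiag[i], lam / Bdiag[i])
    return u

# usage: iterate x <- gmsa_triangle_step(A, b, x) until x stops changing
```

Because $B$ is triangular, each update costs one forward substitution (one pass over the nonzeros of $A$), which is what makes the operator cheap to evaluate exactly.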
