An Incremental Path-Following Splitting Method for Linearly Constrained Nonconvex Nonsmooth Programs

The linearly constrained nonconvex nonsmooth program has drawn much attention in recent years because of its broad modeling power in machine learning. A variety of important problems, including deep learning, matrix factorization, and phase retrieval, can be reformulated as the optimization of a highly nonconvex and nonsmooth objective function subject to linear constraints. However, solving a linearly constrained nonconvex nonsmooth program is challenging and much more complicated than solving its unconstrained counterpart. The feasible region is a polyhedron, onto which a simple projection is intractable in general. In addition, the per-iteration cost is extremely high in the high-dimensional case. Developing a provable and practical algorithm for linearly constrained nonconvex nonsmooth programs is therefore widely regarded as a promising direction. In this paper, we develop an incremental path-following splitting algorithm with a theoretical guarantee and a low computational cost. Specifically, we show that this algorithm converges to an $\epsilon$-approximate stationary solution within $O(1/\epsilon)$ iterations, and that the per-iteration cost is low under the randomized variable selection rule. To the best of our knowledge, this is the first incremental method for solving linearly constrained nonconvex nonsmooth programs with a theoretical guarantee. Experiments on constrained concave penalized linear regression (CCPLR) and the nonconvex support vector machine (NCSVM) demonstrate that the proposed algorithm is more effective and stable than competing heuristic methods.
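
For orientation, one common way to write a program of this class is sketched below. It is an illustrative assumption, not necessarily the exact formulation used in the paper: the symbols $f$ (a smooth, possibly nonconvex term), $r$ (a possibly nonsmooth term), $A$, and $b$ are introduced here only for illustration.

$$
\min_{x \in \mathbb{R}^n} \; f(x) + r(x) \quad \text{subject to} \quad A x \le b .
$$

Under this assumed form, the feasible region $\{x : A x \le b\}$ is the polyhedron mentioned above, and projecting onto it generally requires solving a quadratic program rather than evaluating a closed-form expression.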
