ADMM for multiaffine constrained optimization

ABSTRACT We expand the scope of the alternating direction method of multipliers (ADMM). Specifically, we show that ADMM, when employed to solve problems with multiaffine constraints that satisfy certain verifiable assumptions, converges to the set of constrained stationary points if the penalty parameter in the augmented Lagrangian is sufficiently large. When the Kurdyka–Łojasiewicz (K–Ł) property holds, this is strengthened to convergence to a single constrained stationary point. Our analysis applies under assumptions that we have endeavoured to make as weak as possible. It applies to problems that involve nonconvex and/or nonsmooth objective terms, in addition to the multiaffine constraints that can involve multiple (three or more) blocks of variables. To illustrate the applicability of our results, we describe examples including nonnegative matrix factorization, sparse learning, risk parity portfolio selection, nonconvex formulations of convex problems and neural network training. In each case, our ADMM approach encounters only subproblems that have closed-form solutions.
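As a rough illustration of what "multiaffine constraint" and "closed-form subproblems" mean here, the sketch below applies ADMM to a toy problem: minimize 0.5*||Z - M||_F^2 subject to XY = Z. The constraint is affine in each of X, Y and Z separately, but not jointly. This toy problem, the function name `admm_factorize`, the parameter defaults and the block ordering are assumptions chosen for illustration. They are not taken from the paper, and the sketch does not check the paper's assumptions or choose the penalty parameter according to its theory.

```python
import numpy as np

def admm_factorize(M, rank, rho=10.0, iters=200, seed=0):
    """Illustrative ADMM sketch for: min 0.5*||Z - M||_F^2  s.t.  X @ Y = Z.

    Augmented Lagrangian (W is the multiplier, rho the penalty parameter):
        L = 0.5*||Z - M||^2 + <W, X@Y - Z> + (rho/2)*||X@Y - Z||^2
    """
    M = np.asarray(M, dtype=float)
    rng = np.random.default_rng(seed)
    m, n = M.shape
    X = rng.standard_normal((m, rank))
    Y = rng.standard_normal((rank, n))
    Z = X @ Y
    W = np.zeros_like(M)
    residuals = []
    for _ in range(iters):
        # With Y, Z, W fixed, L is a least-squares problem in X:
        # minimize ||X @ Y - T||^2 with T = Z - W/rho.
        T = Z - W / rho
        X = T @ np.linalg.pinv(Y)
        # Same structure in Y with X fixed.
        Y = np.linalg.pinv(X) @ T
        # Z-update: set the gradient (Z - M) - W - rho*(X@Y - Z) to zero.
        Z = (M + W + rho * (X @ Y)) / (1.0 + rho)
        # Multiplier (dual ascent) step on the constraint residual.
        W = W + rho * (X @ Y - Z)
        residuals.append(np.linalg.norm(X @ Y - Z))
    return X, Y, Z, W, residuals
```

Each block update is a linear least-squares solve or an explicit formula, which is the sense in which the subproblems have closed-form solutions. One property that follows directly from the Z-update and the multiplier step is that W equals Z - M after every iteration, which gives a simple sanity check on an implementation.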
