Parallel Multi-Block ADMM with o(1 / k) Convergence

This paper introduces a parallel and distributed algorithm for solving the following minimization problem with linear constraints: $$\begin{aligned} \text {minimize} ~~&f_1(\mathbf{x}_1) + \cdots + f_N(\mathbf{x}_N)\\ \text {subject to}~~&A_1 \mathbf{x}_1 ~+ \cdots + A_N\mathbf{x}_N =c,\\&\mathbf{x}_1\in {\mathcal {X}}_1,~\ldots , ~\mathbf{x}_N\in {\mathcal {X}}_N, \end{aligned}$$ where $$N \ge 2$$, $$f_i$$ are convex functions, $$A_i$$ are matrices, and $${\mathcal {X}}_i$$ are feasible sets for the variables $$\mathbf{x}_i$$. Our algorithm extends the alternating direction method of multipliers (ADMM): it decomposes the original problem into N smaller subproblems and solves them in parallel at each iteration. This paper shows that the classic ADMM can be extended to an N-block Jacobi scheme while preserving convergence in the following two cases: (i) the matrices $$A_i$$ are mutually near-orthogonal and have full column rank, or (ii) proximal terms are added to the N subproblems (with no assumption on the matrices $$A_i$$). In the latter case, certain proximal terms allow the subproblems to be solved in more flexible and efficient ways. We show that $$\Vert {\mathbf {x}}^{k+1} - {\mathbf {x}}^k\Vert _M^2$$ converges at a rate of o(1/k), where M is a symmetric positive semidefinite matrix. Since the parameters used in the convergence analysis are conservative, we introduce a strategy for automatically tuning the parameters to substantially accelerate our algorithm in practice. We implemented our algorithm (for case (ii) above) on Amazon EC2 and tested it on basis pursuit problems with more than 300 GB of distributed data. This is the first reported successful solution of a compressive sensing problem at such a large scale.
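To make case (ii) concrete, the following is a minimal Python sketch of a Jacobi-style ADMM iteration with proximal terms. It is an illustration under stated assumptions, not the authors' implementation. The objective $$f_i(\mathbf{x}_i) = \tfrac{1}{2}\Vert \mathbf{x}_i - \mathbf{t}_i\Vert ^2$$ is chosen only because each subproblem then reduces to a small linear solve, and the feasible sets $$\mathcal{X}_i$$ are taken to be the whole space. The proximal weights `taus`, the penalty `rho`, the dual step `gamma`, and the names `prox_jacobi_admm` and `demo` are all hypothetical choices that do not appear in the abstract. The list `step_sq` records squared successive differences in the plain Euclidean norm, not the M-norm used in the paper.

```python
import numpy as np


def prox_jacobi_admm(A_blocks, t_blocks, c, rho=1.0, gamma=1.0, iters=8000):
    """Jacobi ADMM sketch with proximal terms for f_i(x_i) = 0.5*||x_i - t_i||^2."""
    N = len(A_blocks)
    x = [np.zeros(A.shape[1]) for A in A_blocks]
    lam = np.zeros(c.shape[0])
    # Assumption: proximal term (tau_i / 2) * ||x_i - x_i^k||^2 with a
    # deliberately conservative weight tau_i = rho * N * ||A_i||_2^2.
    taus = [rho * N * np.linalg.norm(A, 2) ** 2 for A in A_blocks]
    # The matrix of each block's optimality system never changes, so build it once.
    systems = [
        (1.0 + tau) * np.eye(A.shape[1]) + rho * A.T @ A
        for A, tau in zip(A_blocks, taus)
    ]
    step_sq = []
    residual = -c
    for _ in range(iters):
        Ax = [A @ xi for A, xi in zip(A_blocks, x)]
        total = sum(Ax)
        x_new = []
        # Every block reads only the previous iterate of the other blocks,
        # so the N solves below are independent and could run in parallel.
        for i in range(N):
            r = total - Ax[i] - c - lam / rho
            rhs = t_blocks[i] + taus[i] * x[i] - rho * A_blocks[i].T @ r
            x_new.append(np.linalg.solve(systems[i], rhs))
        step_sq.append(sum(np.sum((xn - xo) ** 2) for xn, xo in zip(x_new, x)))
        x = x_new
        # Dual update on the constraint residual sum_i A_i x_i - c.
        residual = sum(A @ xi for A, xi in zip(A_blocks, x)) - c
        lam = lam - gamma * rho * residual
    return x, lam, residual, step_sq


def demo(seed=0):
    """Run the sketch on a small random instance and return a reference solution."""
    rng = np.random.default_rng(seed)
    N, m, n_i = 4, 5, 3
    A_blocks = [rng.standard_normal((m, n_i)) for _ in range(N)]
    t_blocks = [rng.standard_normal(n_i) for _ in range(N)]
    c = rng.standard_normal(m)
    x, lam, residual, step_sq = prox_jacobi_admm(A_blocks, t_blocks, c)
    # Reference: for this quadratic objective the solution is the Euclidean
    # projection of t onto the affine set {x : A x = c}.
    A = np.hstack(A_blocks)
    t = np.concatenate(t_blocks)
    x_star = t - A.T @ np.linalg.solve(A @ A.T, A @ t - c)
    return np.concatenate(x), x_star, residual, step_sq
```

Calling `demo()` returns the computed iterate, the closed-form reference solution, the final constraint residual, and the step history, so the two solutions can be compared directly. The conservative choice of `taus` mirrors the remark in the abstract that parameters guaranteeing convergence tend to be pessimistic; smaller weights may converge faster in practice but are not covered by this sketch.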
