A Block Successive Upper-Bound Minimization Method of Multipliers for Linearly Constrained Convex Optimization

Consider the problem of minimizing the sum of a smooth convex function and a separable nonsmooth convex function subject to linear coupling constraints. Problems of this form arise in many contemporary applications, including signal processing, wireless networking, and smart grid provisioning. Motivated by the large scale of these applications, we propose a new class of first-order primal-dual algorithms, called the block successive upper-bound minimization method of multipliers (BSUM-M), to solve this family of problems. The BSUM-M updates the primal variable blocks successively by minimizing locally tight upper bounds of the augmented Lagrangian of the original problem, followed by a closed-form, gradient-type update of the dual variable. We show that, under certain regularity conditions, the BSUM-M converges to the set of optimal solutions whether the primal block variables are updated in a deterministic or a random order. Moreover, in the absence of linear constraints, the BSUM-M reduces to the block successive upper-bound minimization (BSUM) method, and we show that it achieves linear convergence without requiring strong convexity.
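
The sketch below illustrates, in Python/NumPy, the iteration structure the abstract describes: each block update minimizes a proximal (prox-linear) upper bound of the augmented Lagrangian in that block, and the dual variable then takes a gradient-type step on the constraint residual. The concrete problem instance (least-squares blocks with l1 regularizers coupled by a constraint sum_k E_k x_k = q), the choice of upper bound, the step sizes rho and alpha, and the helper names (bsum_m_sketch, soft_threshold) are illustrative assumptions, not the paper's exact algorithm or step-size rules.

```python
# A minimal BSUM-M-style sketch (illustrative; not the paper's exact method).
# Assumed problem:
#   minimize  sum_k 0.5*||A_k x_k - b_k||^2 + lam_k*||x_k||_1
#   subject to sum_k E_k x_k = q
# Each block update minimizes a quadratic (prox-linear) upper bound of the
# augmented Lagrangian in that block; the dual variable takes a gradient step.
import numpy as np

def soft_threshold(v, tau):
    """Prox operator of tau*||.||_1 (handles the nonsmooth term exactly)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def bsum_m_sketch(A, b, lam, E, q, rho=1.0, alpha=0.5, iters=500, seed=0):
    """A, b, lam, E are lists over the K blocks; q is the constraint rhs."""
    rng = np.random.default_rng(seed)
    K = len(A)
    x = [np.zeros(A[k].shape[1]) for k in range(K)]
    y = np.zeros(q.shape[0])                        # dual variable
    for _ in range(iters):
        for k in rng.permutation(K):                # randomized block order
            # current residual of the linear coupling constraint
            r = sum(E[j] @ x[j] for j in range(K)) - q
            # gradient of the smooth part of the augmented Lagrangian, block k
            grad = (A[k].T @ (A[k] @ x[k] - b[k])
                    + E[k].T @ y + rho * E[k].T @ r)
            # Lipschitz constant of that gradient -> quadratic upper bound
            L = np.linalg.norm(A[k], 2) ** 2 + rho * np.linalg.norm(E[k], 2) ** 2
            # block update = prox step on the upper bound (soft-thresholding)
            x[k] = soft_threshold(x[k] - grad / L, lam[k] / L)
        # gradient-type dual update on the constraint residual
        y = y + alpha * (sum(E[j] @ x[j] for j in range(K)) - q)
    return x, y
```

With this quadratic (Lipschitz-gradient) surrogate, each block update has a closed-form soft-thresholding solution, which is one natural instance of a "locally tight upper bound"; other surrogates satisfying the BSUM conditions could be substituted, and the paper's precise step-size and regularity conditions are not reproduced here.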
