Distributed Algorithms for Composite Optimization: Unified Framework and Convergence Analysis

We study distributed composite optimization over networks: agents minimize a sum of smooth (strongly) convex functions (the agents' sum-utility) plus a nonsmooth (extended-valued) convex one. We propose a general unified algorithmic framework for this class of problems and provide a convergence analysis leveraging the theory of operator splitting. The distinguishing features of our scheme are: (i) when each agent's function is strongly convex, the algorithm converges at a linear rate whose dependence on the agents' functions and on the network topology is decoupled; (ii) when the objective function is convex but not strongly convex, a similar decoupling is established for the coefficient of the proved sublinear rate, which also reveals the role of function heterogeneity in the convergence rate; (iii) the algorithm can adjust the ratio between the number of communications and computations to achieve a rate (in terms of computations) independent of the network connectivity; and (iv) a by-product of our analysis is a tuning recommendation for several existing (non-accelerated) distributed algorithms, yielding provably faster (worst-case) convergence rates for the class of problems under consideration.
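
To make the template concrete, below is a minimal Python sketch of one generic member of this algorithmic class: a decentralized proximal-gradient iteration with gradient tracking, applied to a toy LASSO instance min_x sum_i 0.5*||A_i x - b_i||^2 + lam*||x||_1. This is an illustration under stated assumptions, not the paper's exact scheme; the problem data, the ring-graph Metropolis weight matrix W, and the step size alpha are all hypothetical choices made for the example.

    import numpy as np

    # Toy LASSO data split across agents (hypothetical problem instance).
    rng = np.random.default_rng(0)
    n_agents, dim = 5, 10
    A = [rng.standard_normal((20, dim)) for _ in range(n_agents)]
    b = [rng.standard_normal(20) for _ in range(n_agents)]
    lam = 0.1

    # Doubly stochastic mixing matrix W for a ring graph (Metropolis weights).
    W = np.zeros((n_agents, n_agents))
    for i in range(n_agents):
        for j in ((i - 1) % n_agents, (i + 1) % n_agents):
            W[i, j] = 1.0 / 3.0
        W[i, i] = 1.0 - W[i].sum()

    def grad_f(i, x):
        # Gradient of the smooth local loss f_i(x) = 0.5*||A_i x - b_i||^2.
        return A[i].T @ (A[i] @ x - b[i])

    def prox_g(x, t):
        # Soft-thresholding: proximal operator of t*lam*||.||_1.
        return np.sign(x) * np.maximum(np.abs(x) - t * lam, 0.0)

    alpha = 1e-2                                  # step size (assumed small enough)
    x = np.zeros((n_agents, dim))                 # local copies of the decision variable
    y = np.array([grad_f(i, x[i]) for i in range(n_agents)])  # gradient trackers

    for k in range(500):
        grads_old = np.array([grad_f(i, x[i]) for i in range(n_agents)])
        # (1) Mix with neighbors, then take a proximal step using the tracked gradient.
        x_new = np.array([prox_g((W @ x)[i] - alpha * y[i], alpha)
                          for i in range(n_agents)])
        # (2) Update the trackers so that y_i keeps estimating the average gradient.
        grads_new = np.array([grad_f(i, x_new[i]) for i in range(n_agents)])
        y = W @ y + grads_new - grads_old
        x = x_new

    print("disagreement across agents:", np.linalg.norm(x - x.mean(axis=0)))

In this sketch, feature (iii) of the abstract corresponds roughly to replacing each multiplication by W with t consecutive mixing rounds (i.e., using W^t), trading extra communications per iteration for a convergence rate that is less sensitive to the network connectivity.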
