A Decentralized Proximal-Gradient Method With Network Independent Step-Sizes and Separated Convergence Rates

This paper proposes a novel proximal-gradient algorithm for decentralized optimization problems with a composite objective containing smooth and nonsmooth terms. The smooth terms are handled by gradient updates and the nonsmooth terms by proximal updates. The proposed algorithm is closely related to a previous algorithm, PG-EXTRA (W. Shi, Q. Ling, G. Wu, and W. Yin, “A proximal gradient algorithm for decentralized composite optimization,” IEEE Trans. Signal Process., vol. 63, no. 22, pp. 6013–6023, 2015), but has several advantages. First, agents use uncoordinated step-sizes, and the upper bounds on the step-sizes that guarantee stability are independent of the network topology. The step-sizes depend only on the local objective functions, and they can be as large as those of gradient descent. Second, in the special case without nonsmooth terms, linear convergence is achieved under a strong convexity assumption. The dependence of the convergence rate on the objective functions is separated from its dependence on the network, and the convergence rate of the new algorithm is as good as one of two rates, which match the typical rates of gradient descent and of consensus averaging. We provide numerical experiments to demonstrate the efficacy of the proposed algorithm and to validate our theoretical findings.
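To make the split between gradient and proximal updates concrete, the following is a minimal sketch of a generic, centralized (single-agent) proximal-gradient iteration. It is not the decentralized algorithm proposed in the paper and involves no network or consensus step. The test problem (an l1-regularized least-squares objective), the function names `soft_threshold` and `proximal_gradient`, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1 (handles the nonsmooth term).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(A, b, lam, x):
    # Composite objective: smooth least-squares term plus nonsmooth l1 term.
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def proximal_gradient(A, b, lam, num_iters=500):
    # Step-size 1/L, where L is the Lipschitz constant of the smooth gradient.
    L = np.linalg.norm(A, 2) ** 2
    alpha = 1.0 / L
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        grad = A.T @ (A @ x - b)                           # gradient update (smooth term)
        x = soft_threshold(x - alpha * grad, alpha * lam)  # proximal update (nonsmooth term)
    return x

# Hypothetical synthetic data, used only to exercise the sketch.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10))
x_true = np.zeros(10)
x_true[:3] = [1.0, -2.0, 0.5]
b = A @ x_true
x_hat = proximal_gradient(A, b, lam=0.1)
```

In this single-agent setting the step-size is bounded only by the smoothness of the objective, which is the sense in which the abstract compares the admissible step-sizes to those of gradient descent.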
