Accelerated Distributed Nesterov Gradient Descent for Convex and Smooth Functions

This paper considers the distributed optimization problem over a network, where the objective is to minimize a global function formed by an average of local functions, using only local computation and communication. We develop an Accelerated Distributed Nesterov Gradient Descent (Acc-DNGD) method for convex and smooth objective functions. We show that it achieves an $O(1/t^{1.4-\epsilon})$ convergence rate for any $\epsilon \in (0, 1.4)$ when a vanishing step size is used. The convergence rate can be improved to $O(1/t^2)$ when a fixed step size is used and the objective functions satisfy a special property. To the best of our knowledge, Acc-DNGD is the fastest among all distributed gradient-based algorithms proposed so far.
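To make the problem setting concrete, the following is a minimal sketch of plain consensus-based distributed gradient descent with a vanishing step size. It is not Acc-DNGD, whose update rules are not given in this abstract; it only illustrates what "local computation and communication" means. The ring topology, the mixing weights, the quadratic local functions f_i(x) = 0.5 (x - b_i)^2, and the function name `distributed_gradient_descent` are all assumptions made for illustration.

```python
def distributed_gradient_descent(b, num_iters=2000):
    """Plain (non-accelerated) distributed gradient descent on a ring.

    Assumed setup (illustrative, not from the paper): agent i holds the
    local function f_i(x) = 0.5 * (x - b[i])**2, so the average of the
    local functions is minimized at mean(b). Requires len(b) >= 3.
    """
    n = len(b)
    x = [0.0] * n  # each agent's local estimate
    for t in range(num_iters):
        step = 1.0 / (t + 2)  # vanishing step size
        # Communication: each agent averages with its two ring neighbors
        # (weights 1/2, 1/4, 1/4 form a doubly stochastic mixing matrix).
        mixed = [
            0.5 * x[i] + 0.25 * x[(i - 1) % n] + 0.25 * x[(i + 1) % n]
            for i in range(n)
        ]
        # Local computation: a gradient step using only the agent's own f_i.
        x = [mixed[i] - step * (x[i] - b[i]) for i in range(n)]
    return x


targets = [1.0, 2.0, 3.0, 6.0]  # hypothetical local data; mean is 3.0
estimates = distributed_gradient_descent(targets)
print(estimates)
```

In this toy setup every agent's estimate approaches the minimizer of the average function (here, the mean of `targets`), even though no agent ever sees another agent's local function.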
