Paper information - Accelerated Distributed Nesterov Gradient Descent

This paper considers the distributed optimization problem over a network, where the objective is to optimize a global function formed by a sum of local functions, using only local computation and communication. We develop an accelerated distributed Nesterov gradient descent method. When the objective function is convex and $L$-smooth, we show that it achieves a $O(\frac{1}{t^{1.4-\epsilon}})$ convergence rate for all $\epsilon \in (0,1.4)$. We also show the convergence rate can be improved to $O(\frac{1}{t^2})$ if the objective function is a composition of a linear map and a strongly convex and smooth function. When the objective function is $\mu$-strongly convex and $L$-smooth, we show that it achieves a linear convergence rate of $O([1 - C (\frac{\mu}{L})^{5/7}]^t)$, where $\frac{L}{\mu}$ is the condition number of the objective, and $C>0$ is some constant that does not depend on $\frac{L}{\mu}$.
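To get a feel for what these rates mean in practice, the following minimal sketch converts each bound into a rough iteration count for a target accuracy. It is an illustration only, not part of the paper: the hidden big-O constants are ignored, the constant $C$ is set to 1 purely as an assumption, and the function names and the baseline exponent of 1 (a contraction factor of $1 - C\frac{\mu}{L}$, typical of non-accelerated gradient descent) are hypothetical choices made for comparison.

```python
import math


def iterations_for_linear_rate(kappa, tol, exponent=5 / 7, C=1.0):
    """Smallest integer t with (1 - C * kappa**(-exponent))**t <= tol.

    kappa is the condition number L / mu. C = 1.0 is an assumed
    placeholder; the abstract only states that C > 0 is independent
    of L / mu.
    """
    rho = 1.0 - C * kappa ** (-exponent)
    if not 0.0 < rho < 1.0:
        raise ValueError("contraction factor must lie strictly in (0, 1)")
    return math.ceil(math.log(tol) / math.log(rho))


def iterations_for_sublinear_rate(tol, power):
    """Approximate smallest integer t with 1 / t**power <= tol.

    Use power = 1.4 - eps for the convex, L-smooth case and
    power = 2 for the linear-map composition case.
    """
    return math.ceil(tol ** (-1.0 / power))


if __name__ == "__main__":
    kappa, tol = 100.0, 1e-6
    # Exponent 5/7 from the abstract versus an assumed baseline exponent of 1.
    print("linear, exponent 5/7:", iterations_for_linear_rate(kappa, tol))
    print("linear, exponent 1  :", iterations_for_linear_rate(kappa, tol, exponent=1.0))
    # Sublinear bounds with the exponents quoted in the abstract (eps = 0.1).
    print("sublinear, power 1.3:", iterations_for_sublinear_rate(tol, 1.3))
    print("sublinear, power 2.0:", iterations_for_sublinear_rate(tol, 2.0))
```

Because $(\frac{\mu}{L})^{5/7} > \frac{\mu}{L}$ whenever $\frac{L}{\mu} > 1$, the contraction factor with exponent $5/7$ is smaller, so fewer iterations are needed for the same tolerance under the same assumed $C$. The counts are order-of-magnitude guides only, since the true constants are not given in the abstract.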

Guannan Qu | Na Li
