A Distributed Stochastic Gradient Tracking Method

In this paper, we study distributed multi-agent optimization over a network, where each agent holds a local cost function that is smooth and strongly convex, and the global objective is to find a common solution that minimizes the average of all the cost functions. Assuming agents have access only to unbiased estimates of the gradients of their local cost functions, we consider a distributed stochastic gradient tracking method. We show that, under a constant step size, the iterates generated by each agent converge in expectation to a neighborhood of the optimal solution exponentially fast. More importantly, the limiting (expected) error bounds on the distance of the iterates from the optimal solution decrease with the network size, a performance comparable to that of a centralized stochastic gradient algorithm. Numerical examples further demonstrate the effectiveness of the method.
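To make the setting concrete, the following is a minimal sketch of a stochastic gradient tracking update of the kind studied here: each agent mixes its iterate with its neighbors' via a doubly stochastic matrix and maintains an auxiliary variable that tracks the network-average stochastic gradient. The scalar quadratic costs, ring topology, noise level, and step size below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem: n agents, agent i holds f_i(x) = 0.5 * (x - b_i)^2 (scalar x).
# The minimizer of the average cost (1/n) * sum_i f_i is mean(b).
n = 5
b = rng.normal(size=n)
x_star = b.mean()

# Doubly stochastic mixing matrix for a ring graph (uniform 1/3 weights).
W = np.eye(n) / 3
for i in range(n):
    W[i, (i - 1) % n] = 1 / 3
    W[i, (i + 1) % n] = 1 / 3

sigma = 0.1  # standard deviation of the gradient noise

def stoch_grad(x):
    # Unbiased estimate of the local gradients: f_i'(x_i) = x_i - b_i, plus noise.
    return x - b + sigma * rng.normal(size=n)

alpha = 0.1              # constant step size
x = rng.normal(size=n)   # each agent's local iterate
g = stoch_grad(x)
y = g.copy()             # gradient trackers, initialized at the local stochastic gradients

for _ in range(2000):
    x = W @ (x - alpha * y)   # consensus step on the gradient-corrected iterates
    g_new = stoch_grad(x)
    y = W @ y + g_new - g     # dynamic average tracking of the stochastic gradients
    g = g_new

# With a constant step size, all agents settle in a noise-dominated
# neighborhood of x_star rather than converging exactly.
print(np.abs(x - x_star).max())
```

Because the tracker `y` averages gradient information across the network, the steady-state error shrinks as more agents are added, mirroring the network-size-dependent bounds described in the abstract.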
