A Distributed Stochastic Gradient Tracking Method

In this paper, we study distributed multi-agent optimization over a network, where each agent holds a local cost function that is smooth and strongly convex, and the global objective is to find a common solution that minimizes the average of all the cost functions. Assuming agents have access only to unbiased estimates of the gradients of their local cost functions, we consider a distributed stochastic gradient tracking method. We show that, under a constant step size, the iterates generated by each agent converge in expectation to a neighborhood of the optimal solution exponentially fast. More importantly, the limiting (expected) error bounds on the distance of the iterates from the optimal solution decrease with the network size, a performance comparable to that of a centralized stochastic gradient algorithm. Numerical examples further demonstrate the effectiveness of the method.
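To make the setting concrete, the following is a minimal sketch of a stochastic gradient tracking update of the kind studied here: each agent mixes its iterate with its neighbors' via a doubly stochastic matrix and maintains an auxiliary variable that tracks the network-average stochastic gradient. The scalar quadratic costs, ring topology, noise level, and step size below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem: n agents, agent i holds f_i(x) = 0.5 * (x - b_i)^2 (scalar x).
# The minimizer of the average cost (1/n) * sum_i f_i is mean(b).
n = 5
b = rng.normal(size=n)
x_star = b.mean()

# Doubly stochastic mixing matrix for a ring graph (uniform 1/3 weights).
W = np.eye(n) / 3
for i in range(n):
    W[i, (i - 1) % n] = 1 / 3
    W[i, (i + 1) % n] = 1 / 3

sigma = 0.1  # standard deviation of the gradient noise

def stoch_grad(x):
    # Unbiased estimate of the local gradients: f_i'(x_i) = x_i - b_i, plus noise.
    return x - b + sigma * rng.normal(size=n)

alpha = 0.1              # constant step size
x = rng.normal(size=n)   # each agent's local iterate
g = stoch_grad(x)
y = g.copy()             # gradient trackers, initialized at the local stochastic gradients

for _ in range(2000):
    x = W @ (x - alpha * y)   # consensus step on the gradient-corrected iterates
    g_new = stoch_grad(x)
    y = W @ y + g_new - g     # dynamic average tracking of the stochastic gradients
    g = g_new

# With a constant step size, all agents settle in a noise-dominated
# neighborhood of x_star rather than converging exactly.
print(np.abs(x - x_star).max())
```

Because the tracker `y` averages gradient information across the network, the steady-state error shrinks as more agents are added, mirroring the network-size-dependent bounds described in the abstract.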
