Linearly Convergent Algorithm with Variance Reduction for Distributed Stochastic Optimization

This paper considers a distributed stochastic strongly convex optimization problem, where agents over a network aim to cooperatively minimize the average of all agents' local cost functions. Due to the stochasticity of gradient estimation and the distributed nature of the local objectives, fast linearly convergent distributed algorithms have not yet been achieved. This work proposes a novel distributed stochastic gradient tracking algorithm with variance reduction, where each local gradient is estimated over an increasing batch size of sampled gradients. With an undirected connected communication graph and a geometrically increasing batch size, the iterates are shown to converge in mean to the optimal solution at a geometric rate (achieving linear convergence). The iteration, communication, and oracle complexities for obtaining an $\epsilon$-optimal solution are established as well. In particular, the communication complexity is $\mathcal{O}\left( \ln \left( 1/\epsilon \right) \right)$, while the oracle complexity (number of sampled gradients) is $\mathcal{O}\left( 1/\epsilon^2 \right)$, which is of the same order as that of centralized approaches. Hence, the proposed scheme is communication-efficient without requiring extra sampled gradients. Numerical simulations are given to demonstrate the theoretical results.
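The combination described above can be sketched in a few lines: each agent mixes its iterate with its neighbors', steps along a tracked gradient estimate, and refreshes that estimate with a mini-batch whose size grows geometrically, so the gradient noise variance shrinks at a geometric rate. The sketch below is purely illustrative and is not the paper's implementation: the ring graph, Metropolis-style weights, quadratic costs, step size `alpha`, and growth factor `q` are all assumed for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (not from the paper): 4 agents on a ring, each with a
# scalar quadratic local cost f_i(x) = 0.5 * (x - b_i)^2, so the global
# optimum of the average cost is mean(b) = 2.5.
b = np.array([1.0, 2.0, 3.0, 4.0])
n_agents = len(b)

# Doubly stochastic mixing matrix for the ring graph (Metropolis-style weights).
W = np.array([
    [0.5,  0.25, 0.0,  0.25],
    [0.25, 0.5,  0.25, 0.0 ],
    [0.0,  0.25, 0.5,  0.25],
    [0.25, 0.0,  0.25, 0.5 ],
])

def sampled_grad(i, x, batch):
    """Unbiased gradient of f_i at x, averaged over `batch` noisy samples."""
    return (x - b[i]) + rng.normal(0.0, 1.0, size=batch).mean()

alpha = 0.1   # step size (assumed value)
q = 1.2       # geometric batch-size growth factor (assumed value)

x = np.zeros(n_agents)
g_old = np.array([sampled_grad(i, x[i], 1) for i in range(n_agents)])
y = g_old.copy()  # gradient trackers initialized at the local estimates

for k in range(1, 51):
    batch = int(np.ceil(q ** k))  # geometrically increasing batch size
    x = W @ x - alpha * y         # consensus step along the tracked gradient
    g_new = np.array([sampled_grad(i, x[i], batch) for i in range(n_agents)])
    y = W @ y + g_new - g_old     # gradient-tracking update
    g_old = g_new

# Each agent's iterate x[i] approaches the global optimum mean(b) = 2.5.
```

Because the batch size grows geometrically, the variance of each gradient estimate decays at a geometric rate, matching the geometric decay of the deterministic gradient-tracking error; this is what allows the mean error to contract linearly while the total number of sampled gradients stays of order $1/\epsilon^2$.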
