An Enhanced Gradient-Tracking Bound for Distributed Online Stochastic Convex Optimization

Abstract—Gradient-tracking (GT) methods have emerged as an effective and viable alternative to decentralized (stochastic) gradient descent (DSGD) for solving distributed online stochastic optimization problems. Early analyses of GT methods suggested that they suffer a worse network-dependent rate than DSGD, contradicting experimental evidence. This discrepancy was recently resolved: tighter rates were established for GT methods that improve upon those of DSGD. In this work, we establish further improved rates for GT methods in the online stochastic convex setting. We present an alternative approach for analyzing GT methods for convex problems over static graphs. Compared with previous analyses, this approach yields sharper network-dependent rates.
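
For context, a commonly studied form of the stochastic gradient-tracking recursion over a static graph is sketched below; the step size \alpha, mixing matrix W = [w_{ij}], and per-node stochastic gradients \nabla F_i(\cdot;\xi_i^t) are illustrative notation, not taken verbatim from this paper.

\begin{align}
  % each node i mixes neighbors' iterates and steps along its gradient tracker
  x_i^{t+1} &= \sum_{j=1}^{n} w_{ij}\left( x_j^{t} - \alpha\, g_j^{t} \right), \\
  % dynamic-average-consensus update: g_i^t tracks the network-average stochastic gradient
  g_i^{t+1} &= \sum_{j=1}^{n} w_{ij}\, g_j^{t}
              + \nabla F_i\!\left(x_i^{t+1};\xi_i^{t+1}\right)
              - \nabla F_i\!\left(x_i^{t};\xi_i^{t}\right).
\end{align}

Here each node i maintains an estimate x_i^t of the global minimizer and a tracker g_i^t of the network-average stochastic gradient; the correction term in the second recursion is what removes the data-heterogeneity bias that limits DSGD.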
