On the Convergence of Consensus Algorithms with Markovian Noise and Gradient Bias

This paper presents a finite-time convergence analysis for a decentralized stochastic approximation (SA) scheme. The scheme generalizes several algorithms for decentralized machine learning and multi-agent reinforcement learning. Our proof technique separates the iterates into their consensual part and consensus error. The consensus error is bounded in terms of the stationarity of the consensual part, while the updates of the consensual part can be analyzed as a perturbed SA scheme. Under Markovian noise and time-varying communication graph assumptions, the decentralized SA scheme achieves an expected convergence rate of $\mathcal{O}(\log T/\sqrt{T})$, where $T$ is the number of iterations, in terms of the squared norm of the gradient for nonlinear SA with a smooth but non-convex cost function. This rate is comparable to the best-known performance of SA in a centralized setting with a non-convex potential function.
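The decomposition described above can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the paper's actual algorithm): each agent first averages its iterate with its neighbors' through a doubly stochastic mixing matrix `W`, then applies a local stochastic update. The consensual part is the network-wide average of the iterates, and the consensus error is the deviation from that average; with a doubly stochastic `W`, the mixing step preserves the consensual part and contracts the consensus error.

```python
import numpy as np

def decentralized_sa_step(theta, W, local_updates, step_size):
    """One round of a generic decentralized SA scheme (illustrative sketch).

    theta:         (n_agents, d) array of stacked local iterates
    W:             (n_agents, n_agents) doubly stochastic mixing matrix
    local_updates: (n_agents, d) array of local (possibly biased,
                   Markovian-noise-driven) stochastic update directions
    """
    # Consensus step (neighbor averaging) followed by a local SA update.
    return W @ theta - step_size * local_updates

def decompose(theta):
    """Split stacked iterates into consensual part and consensus error."""
    mean = theta.mean(axis=0, keepdims=True)  # consensual (average) part
    error = theta - mean                      # consensus error
    return mean, error
```

For example, with a symmetric doubly stochastic `W` and zero local updates, one mixing step leaves the consensual part unchanged while shrinking the consensus error by a factor bounded by the second-largest eigenvalue modulus of `W`.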
