Achieving Linear Convergence in Distributed Asynchronous Multiagent Optimization

This article studies multiagent (convex and nonconvex) optimization over static digraphs. We propose a general distributed asynchronous algorithmic framework whereby 1) agents can update their local variables and communicate with their neighbors at any time, without any form of coordination; and 2) they can perform their local computations using (possibly) delayed, out-of-sync information from the other agents. Delays need not be known to the agents or obey any specific profile, and can be time-varying (but bounded). The algorithm builds on a tracking mechanism that is robust against asynchrony (in the above sense) and whose goal is to estimate locally the average of the agents’ gradients. When applied to strongly convex functions, we prove that the algorithm converges at an R-linear (geometric) rate as long as the step-size is sufficiently small. A sublinear convergence rate is proved when nonconvex problems and/or diminishing, uncoordinated step-sizes are considered. To the best of our knowledge, this is the first distributed algorithm with a provable geometric convergence rate in such a general asynchronous setting. Preliminary numerical results demonstrate the efficacy of the proposed algorithm and validate our theoretical findings.
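
To illustrate the tracking idea mentioned above, the following is a minimal sketch of plain synchronous gradient tracking, not the asynchronous algorithm proposed in the article. It assumes, purely for illustration, five agents on an undirected ring with uniform averaging weights, scalar strongly convex local costs f_i(x) = 0.5 * a_i * (x - b_i)^2, and a hand-picked constant step-size; the function name and all parameter values are hypothetical. Each agent keeps an auxiliary variable y_i that tracks the average of the agents' gradients, and the recorded error is expected to shrink geometrically for a sufficiently small step-size.

```python
def gradient_tracking(a, b, alpha=0.05, iters=500):
    """Synchronous gradient tracking on a ring (illustrative sketch only).

    Agent i holds the local cost f_i(x) = 0.5 * a[i] * (x - b[i])**2.
    The ring topology, the 1/3 mixing weights, and alpha are assumptions
    made for this example; they are not taken from the article.
    """
    n = len(a)

    def grad(i, x):
        # Gradient of the local quadratic cost of agent i.
        return a[i] * (x - b[i])

    def mix(v):
        # One averaging step: each agent averages itself and its two ring neighbors.
        return [(v[(i - 1) % n] + v[i] + v[(i + 1) % n]) / 3.0 for i in range(n)]

    # Minimizer of the sum of the local costs, used only to measure the error.
    x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)

    x = [0.0] * n
    # Tracking variables start at the local gradients.
    y = [grad(i, x[i]) for i in range(n)]
    errors = [max(abs(xi - x_star) for xi in x)]

    for _ in range(iters):
        mixed_x = mix(x)
        mixed_y = mix(y)
        # Consensus step followed by a move along the tracked gradient direction.
        x_new = [mixed_x[i] - alpha * y[i] for i in range(n)]
        # Tracking update: mix, then add the change in the local gradient.
        y = [mixed_y[i] + grad(i, x_new[i]) - grad(i, x[i]) for i in range(n)]
        x = x_new
        errors.append(max(abs(xi - x_star) for xi in x))

    return x, x_star, errors


if __name__ == "__main__":
    x, x_star, errors = gradient_tracking(
        a=[1.0, 1.25, 1.5, 1.75, 2.0], b=[1.0, 2.0, 3.0, 4.0, 5.0]
    )
    print("minimizer:", x_star)
    print("final worst-case agent error:", errors[-1])
```

Plotting `errors` on a logarithmic scale should show a roughly straight decaying line, which is the signature of a geometric rate. The setting studied in the article is harder than this sketch: agents act without coordination, use delayed information, and communicate over a digraph, so the tracking step has to be made robust to those effects.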
