Convergence Rates of Distributed Nesterov-Like Gradient Methods on Random Networks

We consider distributed optimization in random networks where N nodes cooperatively minimize the sum Σ_{i=1}^{N} f_i(x) of their individual convex costs. Existing literature proposes distributed gradient-like methods that are computationally cheap and resilient to link failures, but have slow convergence rates. In this paper, we propose accelerated distributed gradient methods that 1) are resilient to link failures; 2) are computationally cheap; and 3) improve convergence rates over other gradient methods. We model the network by a sequence of independent, identically distributed random matrices {W(k)} drawn from the set of symmetric, stochastic matrices with positive diagonals. The network is connected on average, and the cost functions are convex and differentiable, with Lipschitz continuous and bounded gradients. We design two distributed Nesterov-like gradient methods that modify the D-NG and D-NC methods that we proposed for static networks. We prove their convergence rates in terms of the expected optimality gap at the cost function. Let k and K be the number of per-node gradient evaluations and per-node communications, respectively. Then the modified D-NG achieves rates O(log k / k) and O(log K / K), and the modified D-NC achieves rates O(1/k^2) and O(1/K^{2-ξ}), where ξ > 0 is arbitrarily small. For comparison, the standard distributed gradient method cannot do better than Ω(1/k^{2/3}) and Ω(1/K^{2/3}) on the same class of cost functions (even for static networks). Simulation examples illustrate our analytical findings.
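To make the setting concrete, the following is a minimal Python sketch of a Nesterov-like distributed gradient iteration over randomly failing links. The abstract does not state the update rules, so everything specific here is an assumption made for illustration: the step size c/(k+1), the momentum k/(k+3), the extra averaging applied to the momentum term, the link-failure model, the ring topology, and the Huber-type costs (chosen because their gradients are Lipschitz continuous and bounded). It should not be read as the paper's exact modified D-NG method.

```python
import numpy as np

def random_weight_matrix(adjacency, p_fail, rng):
    """Draw one symmetric, stochastic weight matrix with positive diagonal.

    Assumed model: each undirected link fails independently with
    probability p_fail; every surviving link gets weight 1/n.
    """
    n = adjacency.shape[0]
    alive = np.triu(adjacency, 1) * (rng.random((n, n)) > p_fail)
    alive = alive + alive.T                       # keep the matrix symmetric
    w = alive / n                                 # off-diagonal row sums stay below 1
    w[np.diag_indices(n)] = 1.0 - w.sum(axis=1)   # rows sum to 1, diagonal >= 1/n
    return w

def nesterov_like_sketch(grads, adjacency, dim, steps, c=0.5, p_fail=0.3, seed=0):
    """Illustrative Nesterov-like consensus + gradient iteration (assumed form)."""
    rng = np.random.default_rng(seed)
    n = len(grads)
    x = np.zeros((n, dim))   # one row per node
    y = np.zeros((n, dim))
    for k in range(steps):
        w = random_weight_matrix(adjacency, p_fail, rng)
        alpha = c / (k + 1)              # diminishing step size (assumption)
        beta = k / (k + 3)               # Nesterov-style momentum (assumption)
        g = np.stack([grads[i](y[i]) for i in range(n)])
        x_new = w @ y - alpha * g        # average with neighbors, then local gradient step
        y = (1 + beta) * x_new - beta * (w @ x)   # momentum term is averaged as well
        x = x_new
    return x

# Hypothetical example: 5 nodes on a ring, node i holds a Huber cost centered at targets[i].
n = 5
adjacency = np.zeros((n, n))
for i in range(n):
    adjacency[i, (i + 1) % n] = adjacency[(i + 1) % n, i] = 1.0
targets = [0.0, 0.2, 0.4, 0.6, 0.8]
# Gradient of the Huber cost: clipped residual, so it is bounded by 1 in magnitude.
grads = [lambda v, a=a: np.clip(v - a, -1.0, 1.0) for a in targets]
x_final = nesterov_like_sketch(grads, adjacency, dim=1, steps=2000)
```

In this toy problem the sum of the costs is minimized at the mean of `targets` (0.4), so each row of `x_final` can be compared against that value. The weight 1/n per surviving link is just one simple way to guarantee symmetric, stochastic matrices with positive diagonals; other weight choices satisfy the same model.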
