Push-Sum Distributed Dual Averaging for Convex Optimization

Consensus-based algorithms for distributed optimization have recently attracted considerable research, motivated by applications ranging from large-scale machine learning to wireless sensor networks. This work describes a new algorithm, Push-Sum Distributed Dual Averaging, and proves its convergence. The algorithm combines a recent optimization algorithm [1] with a push-sum consensus protocol [2]. As we discuss, using push-sum has significant advantages: the consensus protocol need not be doubly stochastic, and convergence to the true average is guaranteed without knowing the stationary distribution of the update matrix in advance. Furthermore, because each node simply sums the information it receives, the algorithm is truly asynchronous and admits a clean analysis when varying intercommunication intervals and communication delays are modelled. We include experiments in simulation and on a small cluster to complement the theoretical analysis.
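To illustrate the push-sum property mentioned above, the following is a minimal sketch of synchronous push-sum averaging on a small directed graph. It is not the paper's algorithm (there is no dual averaging step, no asynchrony and no delays). The function name `push_sum`, the example graph, the equal-share splitting rule and the number of rounds are all assumptions made for illustration. Each node keeps a running sum and a running weight, splits both equally among itself and its out-neighbours, and adds up whatever it receives. The resulting update matrix is column stochastic but not doubly stochastic, yet the ratio of sum to weight at every node approaches the true average.

```python
def push_sum(values, out_neighbors, rounds=200):
    """Synchronous push-sum averaging (illustrative sketch).

    values:        initial value held by each node
    out_neighbors: dict mapping node index -> list of nodes it sends to
    Returns each node's estimate of the average of `values`.
    """
    n = len(values)
    s = [float(v) for v in values]  # running sums, start at the node's value
    w = [1.0] * n                   # running weights, start at 1
    for _ in range(rounds):
        new_s = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            # Each node keeps one share for itself and pushes one share
            # to every out-neighbour; the shares sum to 1, so total mass
            # is conserved (column-stochastic update).
            targets = [i] + list(out_neighbors[i])
            share = 1.0 / len(targets)
            for j in targets:
                # Receivers only sum what arrives.
                new_s[j] += share * s[i]
                new_w[j] += share * w[i]
        s, w = new_s, new_w
    # The ratio corrects for the unknown stationary distribution.
    return [s[i] / w[i] for i in range(n)]


# Hypothetical strongly connected directed graph: 0->1, 1->2, 2->0, 2->3, 3->0.
# Node 0 receives more than it sends, so the update is not doubly stochastic.
out_neighbors = {0: [1], 1: [2], 2: [0, 3], 3: [0]}
values = [1.0, 2.0, 3.0, 10.0]      # true average is 4.0

estimates = push_sum(values, out_neighbors)
print(estimates)
```

In this sketch the weights themselves do not converge to 1. They track the stationary distribution of the update matrix, which is why dividing the sum by the weight recovers the average without that distribution being known in advance.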

[1]  J. A. Fill  Eigenvalue bounds on convergence to stationarity for nonreversible Markov chains , 1991 .

[2]  Jack Dongarra,et al.  MPI - The Complete Reference: Volume 1, The MPI Core , 1998 .

[3]  William Gropp,et al.  MPI - The Complete Reference: Volume 1 , 1998 .

[4]  Johannes Gehrke,et al.  Gossip-based computation of aggregate information , 2003, 44th Annual IEEE Symposium on Foundations of Computer Science.

[5]  E. Seneta Non-negative Matrices and Markov Chains , 2008 .

[6]  Mikael Johansson,et al.  A Randomized Incremental Subgradient Method for Distributed Optimization in Networked Systems , 2009, SIAM J. Optim..

[7]  Anand D. Sarwate,et al.  Broadcast Gossip Algorithms for Consensus , 2009, IEEE Transactions on Signal Processing.

[8]  Asuman E. Ozdaglar,et al.  Distributed Subgradient Methods for Multi-Agent Optimization , 2009, IEEE Transactions on Automatic Control.

[9]  Soummya Kar,et al.  Gossip Algorithms for Distributed Signal Processing , 2010, Proceedings of the IEEE.

[10]  John N. Tsitsiklis,et al.  Weighted Gossip: Distributed Averaging using non-doubly stochastic matrices , 2010, 2010 IEEE International Symposium on Information Theory.

[11]  Victor M. Preciado,et al.  On asymptotic consensus value in directed random networks , 2010, 49th IEEE Conference on Decision and Control (CDC).

[12]  J. Cortés,et al.  When does a digraph admit a doubly stochastic adjacency matrix? , 2010, Proceedings of the 2010 American Control Conference.

[13]  Angelia Nedic,et al.  Distributed Stochastic Subgradient Projection Algorithms for Convex Optimization , 2008, J. Optim. Theory Appl..

[14]  Michael G. Rabbat,et al.  Distributed consensus and optimization under communication delays , 2011, 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[15]  Michael G. Rabbat,et al.  Distributed dual averaging for convex optimization under communication delays , 2012, 2012 American Control Conference (ACC).

[16]  Martin J. Wainwright,et al.  Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling , 2010, IEEE Transactions on Automatic Control.