Asynchronous Distributed Optimization Via Randomized Dual Proximal Gradient

In this paper we consider distributed optimization problems in which the cost function is separable, i.e., a sum of possibly non-smooth functions all sharing a common variable, and can be split into a strongly convex term and a convex one. The second term is typically used to encode constraints or to regularize the solution. We propose a class of distributed optimization algorithms based on proximal gradient methods applied to the dual problem. We show that, by choosing suitable primal variable copies, the dual problem is itself separable when written in terms of conjugate functions, and the dual variables can be stacked into non-overlapping blocks associated with the computing nodes. We first show that a weighted proximal gradient on the dual function leads to a synchronous distributed algorithm with local dual proximal gradient updates at each node. Then, as the main contribution of the paper, we develop asynchronous versions of the algorithm in which the node updates are triggered by local timers without any global iteration counter. The algorithms are shown to be proper randomized block-coordinate proximal gradient updates on the dual function.
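As a rough illustration of what a randomized block-coordinate proximal gradient update looks like, the sketch below applies the generic technique to a small lasso-type problem, minimizing 0.5*||Ax - b||^2 + lam*||x||_1. This is not the dual algorithm of the paper: the problem, the function names (`soft_threshold`, `randomized_block_prox_grad`, `objective`), the block partition, and the uniform block sampling are all assumptions chosen for illustration. Each iteration picks one block at random, takes a gradient step on the smooth term with step size 1/L_i (L_i being a block Lipschitz constant), and applies the proximal operator of the non-smooth term to that block only.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (elementwise shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(A, b, lam, x):
    # Composite cost: smooth least-squares term plus l1 regularizer.
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def randomized_block_prox_grad(A, b, lam, num_blocks, iters, seed=0):
    # Hypothetical helper: uniform random block selection, step 1/L_i.
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    blocks = np.array_split(np.arange(n), num_blocks)
    # Block Lipschitz constants: squared spectral norm of each column block.
    L = [np.linalg.norm(A[:, I], 2) ** 2 for I in blocks]
    x = np.zeros(n)
    r = A @ x - b  # residual, kept up to date after every block update
    for _ in range(iters):
        i = rng.integers(num_blocks)   # the randomly "awake" block
        I = blocks[i]
        grad_I = A[:, I].T @ r         # partial gradient of the smooth term
        x_new = soft_threshold(x[I] - grad_I / L[i], lam / L[i])
        r += A[:, I] @ (x_new - x[I])  # only this block's contribution changes
        x[I] = x_new
    return x
```

In this sketch only the selected block is read and written at each step, which is the property that makes such updates natural candidates for node-local, timer-triggered execution; how the blocks map to nodes and dual variables in the paper is not reproduced here.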