Optimal and Practical Algorithms for Smooth and Strongly Convex Decentralized Optimization

We consider the task of decentralized minimization of the sum of smooth strongly convex functions stored across the nodes of a network. For this problem, lower bounds on the number of gradient computations and the number of communication rounds required to achieve $\varepsilon$ accuracy have recently been proven. We propose two new algorithms for this decentralized optimization problem and equip them with complexity guarantees. We show that our first method is optimal both in terms of the number of communication rounds and in terms of the number of gradient computations. Unlike existing optimal algorithms, our algorithm does not rely on the expensive evaluation of dual gradients. Our second algorithm is optimal in terms of the number of communication rounds, without a logarithmic factor. Our approach relies on viewing the two proposed algorithms as accelerated variants of the Forward-Backward algorithm for solving monotone inclusions associated with the decentralized optimization problem. We also verify the efficacy of our methods against state-of-the-art algorithms through numerical experiments.
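For concreteness, the abstract does not spell out the formulation, so the following is a standard sketch of the setting it describes; the notation ($f_i$, $W$, $A$, $B$, $\gamma$) is introduced here for illustration only. The problem is

$$\min_{x \in \mathbb{R}^d} \; f(x) := \sum_{i=1}^n f_i(x),$$

where each $f_i$ is $L$-smooth and $\mu$-strongly convex and is held privately by node $i$ of the network. Introducing local copies $x_i$ of the variable and a gossip (communication) matrix $W$ whose kernel is the consensus subspace, the consensus constraint can be encoded as $W^{1/2}\mathbf{x} = 0$, and the optimality conditions of the resulting problem take the form of a monotone inclusion

$$0 \in A(z) + B(z),$$

with $A$ single-valued (the gradient part) and $B$ maximally monotone (the consensus/constraint part). The Forward-Backward iteration for such an inclusion reads

$$z^{k+1} = (I + \gamma B)^{-1}\big(z^k - \gamma A(z^k)\big), \qquad \gamma > 0,$$

i.e., a forward (gradient) step on $A$ followed by a backward (resolvent) step on $B$; per the abstract, the two proposed methods are accelerated variants of this scheme.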
