Decentralized Optimization Over Noisy, Rate-Constrained Networks: How We Agree By Talking About How We Disagree

In decentralized optimization, multiple nodes in a network collaborate to minimize the sum of their local loss functions. The information exchange between nodes required for this task is often limited by network connectivity. We consider a generalization of this setting, in which communication is further hindered by (i) a finite data-rate constraint on the signal transmitted by any node, and (ii) additive noise corrupting the signal received by any node. We develop a novel algorithm for this scenario: Decentralized Lazy Mirror Descent with Differential Exchanges (DLMD-DiffEx), which guarantees convergence of the local estimates to the optimal solution. A salient feature of DLMD-DiffEx is the introduction of additional proxy variables that are maintained by the nodes to account for the disagreement in their estimates due to channel noise and data-rate constraints. We investigate the performance of DLMD-DiffEx both theoretically and through numerical evaluations.
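To make the setting concrete, the following is a minimal sketch of decentralized optimization with quantized, noisy differential exchanges. It is not the DLMD-DiffEx algorithm itself: the quadratic local losses, ring topology, mixing weights, dithered uniform quantizer, and step sizes are illustrative assumptions introduced here; only the idea of transmitting the difference between a node's current estimate and a shared proxy is taken from the description above.

# Hedged sketch: decentralized gradient descent with rate-constrained, noisy
# differential exchanges. NOT the authors' DLMD-DiffEx; all modeling choices
# below (losses, topology, quantizer, step sizes) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Problem: n nodes jointly minimize sum_i 0.5 * ||x - a_i||^2 (optimum = mean of a_i).
n, d = 5, 3
a = rng.normal(size=(n, d))
x_opt = a.mean(axis=0)

# Ring topology with uniform mixing weights (assumed for this sketch).
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i + 1) % n] = 0.25
    W[i, (i - 1) % n] = 0.25

def quantize(v, step=0.05):
    """Dithered uniform quantizer: stands in for the finite data-rate constraint."""
    dither = rng.uniform(-step / 2, step / 2, size=v.shape)
    return step * np.round((v + dither) / step)

noise_std = 0.01          # additive channel noise at the receiver
x = np.zeros((n, d))      # local estimates, one row per node
proxy = np.zeros((n, d))  # proxy variables: the network's (noisy) view of each estimate
                          # (kept as a single shared copy here, a simplification)

for t in range(1, 501):
    # Each node transmits only the quantized *difference* between its current
    # estimate and the proxy already held for it (a differential exchange).
    diff = quantize(x - proxy)
    received = diff + noise_std * rng.normal(size=diff.shape)
    proxy = proxy + received  # proxies absorb the quantized, noise-corrupted updates

    # Consensus on the proxies plus a local gradient step with diminishing step size.
    grad = x - a              # gradient of 0.5 * ||x_i - a_i||^2
    eta = 1.0 / np.sqrt(t)
    x = W @ proxy - eta * grad

print("distance to optimum per node:", np.linalg.norm(x - x_opt, axis=1))

Because nodes only ever see the proxies, the quantization error and channel noise enter the iteration through the proxy updates; this is the kind of disagreement between local estimates that the proxy variables in DLMD-DiffEx are introduced to track.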
