Federated Learning over Noisy Channels

Does Federated Learning (FL) work when both uplink and downlink communications have errors? How much communication noise can FL handle, and what is its impact on the learning performance? This work is devoted to answering these practically important questions by explicitly incorporating both uplink and downlink noisy channels in the FL pipeline. We present several novel convergence analyses of FL over simultaneous uplink and downlink noisy communication channels, which encompass full and partial client participation, direct model and model-differential transmissions, and non-independent and identically distributed (non-IID) local datasets. These analyses characterize the sufficient conditions for FL over noisy channels to have the same convergence behavior as the ideal case of no communication error. More specifically, in order to maintain the O(1/T) convergence rate of FedAvg with perfect communications, the uplink and downlink signal-to-noise ratios (SNRs) for direct model transmissions should be controlled so that they scale as O(t), where t is the index of communication rounds, but can stay O(1) (i.e., constant) for model-differential transmissions. The key insight of these theoretical results is a "flying under the radar" principle: stochastic gradient descent (SGD) is an inherently noisy process, and uplink/downlink communication noise can be tolerated as long as it does not dominate the time-varying SGD noise. We exemplify these theoretical findings with two widely adopted communication techniques, transmit power control and receive diversity combining, and further validate their performance advantages over the standard methods via numerical experiments on several real-world FL tasks.
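
To make the stated scaling conditions concrete, below is a minimal NumPy sketch (our illustration, not the authors' code or experimental setup) of FedAvg in which every downlink broadcast and uplink upload passes through an additive white Gaussian noise channel. All names (run_noisy_fedavg, add_channel_noise, local_sgd), the toy least-squares objective, and the specific SNR values are assumptions made for illustration; only the schedule itself follows the abstract's sufficient conditions, with SNR growing as O(t) for direct model transmission and staying constant for model-differential transmission.

```python
import numpy as np

def local_sgd(w, data, lr=0.02, steps=5):
    """A few local gradient steps on a toy least-squares objective."""
    X, y = data
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def add_channel_noise(vec, snr_db, rng):
    """Corrupt a transmitted vector with AWGN at the given SNR (in dB)."""
    signal_power = np.mean(vec ** 2) + 1e-12
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return vec + rng.normal(0.0, np.sqrt(noise_power), size=vec.shape)

def run_noisy_fedavg(clients, dim, rounds=200, differential=True, seed=0):
    """FedAvg where both downlink and uplink pass through noisy channels."""
    rng = np.random.default_rng(seed)
    w_global = np.zeros(dim)
    for t in range(1, rounds + 1):
        # SNR schedule mirroring the paper's sufficient conditions
        # (the 20 dB baseline is an arbitrary illustrative choice):
        # grow the SNR as O(t) for direct model transmission, keep it
        # O(1) for model-differential transmission.
        snr_db = 20.0 if differential else 10.0 * np.log10(t) + 10.0
        uploads = []
        for data in clients:
            # Downlink: each client receives a noisy copy of the model.
            w_received = add_channel_noise(w_global, snr_db, rng)
            w_local = local_sgd(w_received, data)
            # Uplink: send either the model itself or the model update.
            payload = (w_local - w_received) if differential else w_local
            uploads.append(add_channel_noise(payload, snr_db, rng))
        aggregate = np.mean(uploads, axis=0)
        w_global = (w_global + aggregate) if differential else aggregate
    return w_global

# Toy usage: a few clients with mildly non-IID linear-regression data.
rng = np.random.default_rng(1)
w_true = rng.normal(size=10)
clients = []
for k in range(8):
    X = rng.normal(loc=0.1 * k, size=(50, 10))   # per-client feature shift
    clients.append((X, X @ w_true + 0.1 * rng.normal(size=50)))
w_est = run_noisy_fedavg(clients, dim=10, differential=True)
print("distance to w_true:", np.linalg.norm(w_est - w_true))
```

Setting differential=False exercises the direct-transmission branch: there, a constant SNR would leave a persistent noise floor, whereas growing the SNR with t keeps the channel noise below the shrinking SGD noise, which is the "flying under the radar" principle in action. In differential mode the transmitted update itself shrinks over rounds, so a constant SNR (noise proportional to a vanishing signal) suffices.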
