Physical-Layer Arithmetic for Federated Learning in Uplink MU-MIMO Enabled Wireless Networks

Federated learning is a promising machine learning paradigm in which a large number of clients cooperatively train a global model using their respective local data. In this paper, we consider the application of federated learning in wireless networks featuring uplink multiuser multiple-input multiple-output (MU-MIMO), and aim to optimize the communication efficiency of aggregating client-side updates by exploiting the inherent superposition of radio frequency (RF) signals. We propose a novel approach named Physical-Layer Arithmetic (PhyArith), in which the clients encode their local updates into aligned digital sequences that are converted into RF signals and transmitted to the server simultaneously, and the server recovers the exact sum of these updates directly from the superimposed RF signal by employing a customized sum-product algorithm. PhyArith is compatible with commodity devices because both the client-side encoding and the server-side decoding are fully digital, and it can also be integrated with other acceleration techniques based on update compression. Simulation results show that PhyArith further improves the communication efficiency by 1.5 to 3 times for training LeNet-5, compared with solutions that apply only update compression.
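The idea of recovering a sum directly from superimposed signals can be illustrated with a small toy simulation. The sketch below is not the PhyArith encoder or its customized sum-product decoder: it assumes uncoded BPSK, a channel matrix known perfectly at the server, and an exhaustive maximum-likelihood search in place of message passing, which is only feasible for a handful of clients. All function names, the bit width, and the example updates are hypothetical. Each client sends the bits of its quantized update in aligned slots, the server estimates how many clients sent a 1 in each slot, and the per-slot counts are weighted by powers of two to form the sum.

```python
import itertools
import random


def bits_lsb_first(value, width):
    """Fixed-width binary expansion, least significant bit first."""
    return [(value >> j) & 1 for j in range(width)]


def transmit_slot(bits, channel, noise_std, rng):
    """Superimpose one BPSK symbol per client: y = H x + n, with x_k = 1 - 2*bit_k."""
    symbols = [1 - 2 * b for b in bits]
    received = []
    for row in channel:  # one row of H per server antenna
        clean = sum(h * s for h, s in zip(row, symbols))
        noise = complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        received.append(clean + noise)
    return received


def detect_bit_sum(received, channel):
    """Exhaustive ML search over joint bit vectors (assumed stand-in for
    sum-product decoding); returns the number of clients that sent a 1."""
    num_clients = len(channel[0])
    best_sum, best_dist = 0, float("inf")
    for cand in itertools.product((0, 1), repeat=num_clients):
        symbols = [1 - 2 * b for b in cand]
        dist = sum(
            abs(y - sum(h * s for h, s in zip(row, symbols))) ** 2
            for y, row in zip(received, channel)
        )
        if dist < best_dist:
            best_sum, best_dist = sum(cand), dist
    return best_sum


def aggregate(values, width, channel, noise_std, rng):
    """Recover sum(values) slot by slot from the superimposed signals."""
    total = 0
    for j in range(width):
        slot_bits = [bits_lsb_first(v, width)[j] for v in values]
        received = transmit_slot(slot_bits, channel, noise_std, rng)
        total += detect_bit_sum(received, channel) << j  # weight slot j by 2**j
    return total


rng = random.Random(0)
num_clients, num_antennas, width = 3, 4, 8  # hypothetical toy dimensions
channel = [
    [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(num_clients)]
    for _ in range(num_antennas)
]
updates = [23, 200, 77]  # hypothetical quantized local updates
print(aggregate(updates, width, channel, 0.05, rng), sum(updates))
```

The server in this sketch never needs the individual updates, only the per-slot bit counts, which is why aligning the bit positions across clients matters. The exhaustive search grows as 2^K in the number of clients K, so a practical decoder would need an iterative method such as the message passing the abstract refers to.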
