Over-the-Air Machine Learning at the Wireless Edge

We study distributed machine learning at the wireless edge, where power-limited devices (workers) with local datasets perform distributed stochastic gradient descent (DSGD) over the air with the help of a remote parameter server (PS). The workers communicate their local gradient estimates to the PS over a bandwidth-limited fading multiple access channel (MAC). Motivated by the additive nature of the wireless MAC, we study analog transmission of low-dimensional gradient estimates, where each worker accumulates the error from previous iterations. We also design an opportunistic worker scheduling scheme to efficiently align the received gradient vectors at the PS. Numerical results show that the proposed DSGD algorithm converges much faster than the state of the art, while also achieving significantly higher accuracy.
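
The scheme described above combines three ingredients: each worker compresses its gradient to a low-dimensional estimate, carries the resulting compression error over to the next iteration, and transmits the estimate in analog so that the MAC itself sums the workers' signals at the PS. The following is a minimal NumPy sketch of one communication round under simplifying assumptions: no fading, no worker scheduling, a plain Gaussian MAC, and top-k sparsification standing in for the low-dimensional compressor. All names and parameter values (num_workers, top_k, noise_std, lr) are illustrative and not taken from the paper.

# Minimal sketch of one round of over-the-air DSGD with error accumulation.
# Assumes a Gaussian MAC (no fading or scheduling) and top-k sparsification
# as the low-dimensional compressor; all parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

dim = 1000          # model dimension
num_workers = 10    # number of edge devices (workers)
top_k = 100         # number of gradient entries each worker transmits
noise_std = 0.1     # receiver noise standard deviation at the PS
lr = 0.05           # learning rate applied by the parameter server

theta = np.zeros(dim)                    # global model held at the PS
error = np.zeros((num_workers, dim))     # per-worker accumulated compression error


def local_gradient(worker, model):
    """Placeholder for a gradient computed on the worker's local dataset."""
    return rng.normal(size=model.shape)   # stands in for a real stochastic gradient


def sparsify(vec, k):
    """Keep the k largest-magnitude entries of vec, zero out the rest."""
    out = np.zeros_like(vec)
    idx = np.argsort(np.abs(vec))[-k:]
    out[idx] = vec[idx]
    return out


# ---- one communication round ----
received = np.zeros(dim)
for w in range(num_workers):
    g = local_gradient(w, theta) + error[w]   # add error carried from past rounds
    g_sparse = sparsify(g, top_k)             # low-dimensional gradient estimate
    error[w] = g - g_sparse                   # accumulate what was not transmitted
    received += g_sparse                      # analog superposition on the MAC

received += rng.normal(scale=noise_std, size=dim)   # additive receiver noise at the PS
theta -= lr * received / num_workers                # PS applies the averaged update

In the actual scheme, the placeholder compressor, power allocation, and the fading channel with opportunistic worker scheduling would replace the simplified steps above; the sketch only illustrates how error accumulation and over-the-air summation fit together.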
