Decentralized Federated Learning via SGD over Wireless D2D Networks

Federated Learning (FL), an emerging paradigm for fast intelligent acquisition at the network edge, enables joint training of a machine learning model over distributed data sets and computing resources with limited disclosure of local data. Communication is a critical enabler of large-scale FL due to significant amount of model information exchanged among edge devices. In this paper, we consider a network of wireless devices sharing a common fading wireless channel for the deployment of FL. Each device holds a generally distinct training set, and communication typically takes place in a Device-to-Device (D2D) manner. In the ideal case in which all devices within communication range can communicate simultaneously and noiselessly, a standard protocol that is guaranteed to converge to an optimal solution of the global empirical risk minimization problem under convexity and connectivity assumptions is Decentralized Stochastic Gradient Descent (DSGD). DSGD integrates local SGD steps with periodic consensus averages that require communication between neighboring devices. In this paper, wireless protocols are proposed that implement DSGD by accounting for the presence of path loss, fading, blockages, and mutual interference. The proposed protocols are based on graph coloring for scheduling and on both digital and analog transmission strategies at the physical layer, with the latter leveraging over-the-air computing via sparsity-based recovery.

[1]  John Langford,et al.  Scaling up machine learning: parallel and distributed approaches , 2011, KDD '11 Tutorials.

[2]  Angelia Nedic,et al.  Distributed Stochastic Subgradient Projection Algorithms for Convex Optimization , 2008, J. Optim. Theory Appl..

[3]  Suhas N. Diggavi,et al.  SPARQ-SGD: Event-Triggered and Compressed Communication in Decentralized Stochastic Optimization , 2019, ArXiv.

[4]  Stephen P. Boyd,et al.  Fast linear iterations for distributed averaging , 2003, 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475).

[5]  Soummya Kar,et al.  An introduction to decentralized stochastic optimization with gradient tracking , 2019 .

[6]  Richard Nock,et al.  Advances and Open Problems in Federated Learning , 2019, Found. Trends Mach. Learn..

[7]  Walid Saad,et al.  A Joint Learning and Communications Framework for Federated Learning Over Wireless Networks , 2021, IEEE Transactions on Wireless Communications.

[8]  Deniz Gündüz,et al.  One-Bit Over-the-Air Aggregation for Communication-Efficient Federated Edge Learning: Design and Convergence Analysis , 2020, ArXiv.

[9]  Deniz Gündüz,et al.  Federated Learning Over Wireless Fading Channels , 2019, IEEE Transactions on Wireless Communications.

[10]  Songtao Lu,et al.  Distributed Learning in the Non-Convex World: From Batch to Streaming Data, and Beyond , 2020, ArXiv.

[11]  Thore Husfeldt,et al.  Graph colouring algorithms , 2015, ArXiv.

[12]  Suhas Diggavi,et al.  Qsparse-Local-SGD: Distributed SGD With Quantization, Sparsification, and Local Computations , 2019, IEEE Journal on Selected Areas in Information Theory.

[13]  N. Meinshausen,et al.  LASSO-TYPE RECOVERY OF SPARSE REPRESENTATIONS FOR HIGH-DIMENSIONAL DATA , 2008, 0806.0145.

[14]  Philippe Ciblat,et al.  Analysis of Max-Consensus Algorithms in Wireless Channels , 2012, IEEE Transactions on Signal Processing.

[15]  Monica Nicoli,et al.  Federated Learning With Cooperating Devices: A Consensus Approach for Massive IoT Networks , 2019, IEEE Internet of Things Journal.

[16]  Deniz Gündüz,et al.  Machine Learning at the Wireless Edge: Distributed Stochastic Gradient Descent Over-the-Air , 2019, 2019 IEEE International Symposium on Information Theory (ISIT).

[17]  Joonhyuk Kang,et al.  Wireless Federated Distillation for Distributed Edge Learning with Heterogeneous Data , 2019, 2019 IEEE 30th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC).

[18]  Anit Kumar Sahu,et al.  MATCHA: Speeding Up Decentralized SGD via Matching Decomposition Sampling , 2019, 2019 Sixth Indian Control Conference (ICC).