Guess what? You can boost Federated Learning for free

Federated learning (FL) exploits the computation power of edge devices, typically mobile phones, while addressing privacy by letting data stay where it is produced. FL has been used by major service providers to improve item recommendations, virtual keyboards, and text auto-completion services. While appealing, FL performance is hampered by multiple factors: (i) differing capabilities of participating clients (e.g., computing power, memory, and network connectivity); (ii) strict training constraints, where devices must be idle, plugged in, and connected to unmetered Wi-Fi; and (iii) data heterogeneity (a.k.a. non-IIDness). Together, these factors lead to uneven participation, straggling, and dropouts, and consequently slow convergence, challenging the practicality of FL for many applications. In this paper, we present GEL, the Guess and Learn algorithm, which significantly speeds up convergence by guessing model updates for each client. The power of GEL lies in performing "free" learning steps that require no additional gradient computations. GEL produces these guesses through clever use of the moments in the Adam optimizer, combined with the last gradient computed on each client. Our extensive experimental study involving five standard FL benchmarks shows that GEL speeds up convergence by up to 1.64× in heterogeneous systems in the presence of non-IID data, saving tens of thousands of gradient computations.
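To make the core idea concrete, the sketch below illustrates one plausible form of a "guessed" step, under the assumption that the client keeps stepping Adam with its last computed (stale) gradient so that each extra step costs no new forward/backward pass. This is an illustration of the mechanism described in the abstract, not the paper's actual algorithm; all names (adam_guess_steps, guess_steps, last_grad) are hypothetical.

```python
# Hypothetical sketch: advance Adam with a stale gradient to get "free" steps.
# Assumption (not from the paper): the guess reuses the last computed gradient
# to update Adam's moment estimates and parameters without new gradient work.
import numpy as np

def adam_guess_steps(params, last_grad, m, v, t, guess_steps=1,
                     lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply `guess_steps` Adam updates that reuse the stale gradient
    `last_grad` instead of computing fresh gradients."""
    for _ in range(guess_steps):
        t += 1
        # Standard Adam moment updates, fed with the last *computed* gradient.
        m = beta1 * m + (1 - beta1) * last_grad
        v = beta2 * v + (1 - beta2) * last_grad ** 2
        # Bias-corrected moment estimates.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # "Free" parameter step: no forward/backward pass is needed here.
        params = params - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params, m, v, t
```

The design intuition is that Adam's exponential moving averages change smoothly, so a step driven by slightly stale first and second moments can still point in a useful descent direction, which is what lets such guessed steps come at essentially zero gradient-computation cost.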
