Cost-Effective Federated Learning Design

Federated learning (FL) is a distributed learning paradigm that enables a large number of devices to collaboratively learn a model without sharing their raw data. Despite its practical efficiency and effectiveness, the iterative on-device learning process incurs considerable cost in terms of learning time and energy consumption, which depends crucially on the number of selected clients and the number of local iterations in each training round. In this paper, we analyze how to design adaptive FL that optimally chooses these essential control variables to minimize the total cost while ensuring convergence. Theoretically, we analytically characterize how the total cost depends on these control variables through the convergence upper bound. To efficiently solve the cost minimization problem, we develop a low-cost sampling-based algorithm to learn the unknown convergence-related parameters. We derive important solution properties that effectively identify the design principles for different metric preferences. Practically, we evaluate our theoretical results both in a simulated environment and on a hardware prototype. Experimental evidence verifies our derived properties and demonstrates that our proposed solution achieves near-optimal performance for various datasets, different machine learning models, and heterogeneous system settings.
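To make the design problem concrete, the sketch below illustrates the general shape of such a cost minimization: a weighted time/energy cost per round, a convergence-bound-style expression for the number of rounds as a function of the two control variables, and a brute-force search over candidate values. All constants, functional forms, and names (e.g., `rounds_to_converge`, `ALPHA`, `BETA`) are illustrative assumptions, not the paper's actual formulation or algorithm.

```python
# Minimal sketch (assumed model, not the paper's exact formulation): total cost
# = (number of rounds to converge) x (weighted time/energy cost per round),
# minimized over the number of selected clients K and local iterations E.
import itertools

# Hypothetical per-round system parameters (assumed values).
T_COMP_PER_ITER = 0.05   # seconds of local computation per iteration
T_COMM = 0.4             # seconds per model upload/download
E_COMP_PER_ITER = 0.2    # Joules per local iteration per client
E_COMM = 1.5             # Joules per model exchange per client
RHO = 0.6                # preference weight between time (RHO) and energy (1 - RHO)

# Hypothetical convergence-related constants (stand-ins for the unknown
# parameters the paper estimates via its sampling-based procedure).
ALPHA, BETA, EPSILON = 50.0, 5.0, 0.05

def rounds_to_converge(num_clients: int, local_iters: int) -> float:
    """Assumed convergence-bound shape: more local iterations and more
    sampled clients reduce the required rounds, with diminishing returns."""
    return (ALPHA / (local_iters * EPSILON)) * (1.0 + BETA / num_clients)

def cost_per_round(num_clients: int, local_iters: int) -> float:
    """Weighted time/energy cost of one round under the assumed model."""
    time = local_iters * T_COMP_PER_ITER + T_COMM               # wall-clock (clients compute in parallel)
    energy = num_clients * (local_iters * E_COMP_PER_ITER + E_COMM)
    return RHO * time + (1.0 - RHO) * energy

def total_cost(num_clients: int, local_iters: int) -> float:
    return rounds_to_converge(num_clients, local_iters) * cost_per_round(num_clients, local_iters)

# Brute-force search over the two control variables as a stand-in for the
# paper's optimization; a real design would exploit the derived solution
# properties rather than enumerating candidates.
best_k, best_e = min(
    itertools.product(range(1, 51), range(1, 101)),
    key=lambda ke: total_cost(*ke),
)
print(f"best (clients, local iterations) = ({best_k}, {best_e}), "
      f"cost = {total_cost(best_k, best_e):.2f}")
```

Varying `RHO` in this sketch mimics different metric preferences: a time-dominated objective favors more clients and fewer local iterations per round, while an energy-dominated one pushes toward fewer participating clients, which is the kind of trade-off the derived design principles are meant to capture.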
