Sample Selection with Deadline Control for Efficient Federated Learning on Heterogeneous Clients

Federated Learning (FL) trains a machine learning model on distributed clients without exposing individual data. Unlike centralized training, which typically relies on carefully organized data, FL must cope with on-device data that are often unfiltered and imbalanced. As a result, the conventional FL training protocol, which treats all data equally, wastes local computational resources and slows down the global learning process. To this end, we propose FedBalancer, a systematic FL framework that actively selects clients’ training samples. Our sample selection strategy prioritizes more “informative” samples while respecting the privacy and computational capabilities of clients. To better leverage sample selection for speeding up global training, we further introduce an adaptive deadline control scheme that predicts the optimal deadline for each round as the clients’ training data vary. Compared with existing FL algorithms paired with deadline configuration methods, our evaluation on five datasets from three different domains shows that FedBalancer improves time-to-accuracy performance by 1.22∼4.62× while improving model accuracy by 1.0∼3.3%. We also show that FedBalancer is readily applicable to other FL approaches: it improves convergence speed and accuracy when operating jointly with three different FL algorithms.
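To make the sample selection idea concrete, the sketch below shows one common way to prioritize “informative” samples on a client: rank them by their per-sample loss under the current global model and keep the hardest fraction. This is a minimal illustration under assumed conventions, not FedBalancer’s actual algorithm; the function name, the fixed `select_ratio`, and the loss-only ranking are hypothetical simplifications introduced here.

```python
import numpy as np

def select_informative_samples(losses, select_ratio=0.5):
    """Keep the highest-loss fraction of a client's samples.

    `losses` is a 1-D array of per-sample training losses computed
    under the current global model; high-loss samples are treated as
    more "informative". The name and the fixed `select_ratio` are
    illustrative assumptions, not the paper's actual policy.
    """
    losses = np.asarray(losses)
    k = max(1, int(select_ratio * len(losses)))
    # Indices of the k largest losses, in descending order of loss.
    return np.argsort(losses)[::-1][:k]

# Example: a client with 8 samples keeps the 4 hardest ones.
per_sample_losses = [0.1, 2.3, 0.5, 1.8, 0.05, 0.9, 3.1, 0.2]
selected = select_informative_samples(per_sample_losses, select_ratio=0.5)
print(selected)  # -> [6 1 3 5]
```

In a full system, the selected indices would determine which local samples a client trains on in the next round, and the per-round deadline would then be tuned to how long clients need to process their (now smaller) selected sets; the abstract’s adaptive deadline control addresses exactly that coupling.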
