Towards Understanding Quality Challenges of the Federated Learning: A First Look from the Lens of Robustness

A. Eslami Abyane, University of Calgary (amin.eslamiabyane@ucalgary.ca)
D. Zhu, Technical University of Munich (derui.zhu@tum.de)
R. Souza, University of Calgary (roberto.medeirosdeso@ucalgary.ca)
L. Ma, University of Alberta (ma.lei@acm.org)
H. Hemmati, University of Calgary (hadi.hemmati@ucalgary.ca)

Federated learning (FL) is a widely adopted distributed learning paradigm in practice, which aims to preserve users' data privacy while leveraging the entire dataset of all participants for training. In FL, multiple models are trained independently on the users' devices and aggregated centrally to update a global model in an iterative process. Although this approach is excellent at preserving privacy by design, FL still tends to suffer from quality issues such as attacks or Byzantine faults. Recent attempts have been made to address such quality challenges through robust aggregation techniques for FL. However, the effectiveness of state-of-the-art (SOTA) robust FL techniques remains unclear and lacks a comprehensive study. Therefore, to better understand the current quality status and challenges of these SOTA FL techniques in the presence of attacks and faults, in this paper, we perform a large-scale empirical study that investigates SOTA FL's quality from multiple angles: attacks, simulated faults (via mutation operators), and aggregation (defense) methods. In particular, we perform our study on two generic image datasets and one real-world federated medical image dataset. We also systematically investigate the effect of the distribution of attacks/faults over users and of the independent and identically distributed (IID) factor, per dataset, on the robustness results. After a large-scale analysis with 496 configurations, we find that most mutators on each individual user have a negligible effect on the final model. Moreover, choosing the most robust FL aggregator depends on the attacks and datasets. Finally, we illustrate that it is possible to achieve a generic solution that works almost as well as, or even better than, any single aggregator on all attacks and configurations with a simple ensemble model of aggregators. Our replication package is available online: https://github.com/aminesi/federated
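To make the aggregation step and the ensemble idea concrete, the following is a minimal sketch, assuming NumPy and synthetic client updates; the aggregator functions (fedavg, coordinate_median, trimmed_mean), the coordinate-wise-median ensemble rule, and the toy data are illustrative assumptions, not the paper's implementation.

```python
# A minimal sketch of robust aggregation in one FL round, assuming NumPy and
# synthetic client updates. The aggregators and the ensemble rule below are
# illustrative assumptions, not the paper's exact method.
import numpy as np

def fedavg(updates):
    # Plain federated averaging: coordinate-wise mean of all client updates.
    return np.mean(updates, axis=0)

def coordinate_median(updates):
    # Coordinate-wise median: robust to a minority of extreme (Byzantine) updates.
    return np.median(updates, axis=0)

def trimmed_mean(updates, trim=2):
    # Discard the `trim` largest and smallest values per coordinate, then
    # average (assumes at most `trim` Byzantine clients).
    sorted_updates = np.sort(updates, axis=0)
    return np.mean(sorted_updates[trim:len(updates) - trim], axis=0)

def ensemble(updates, aggregators):
    # Illustrative ensemble: run every aggregator and combine their outputs
    # coordinate-wise, so no single aggregator's failure dominates.
    candidates = np.stack([agg(updates) for agg in aggregators])
    return np.median(candidates, axis=0)

rng = np.random.default_rng(0)
honest = rng.normal(0.0, 0.1, size=(8, 4))   # 8 honest client updates
byzantine = np.full((2, 4), 10.0)            # 2 attacked/faulty clients
updates = np.concatenate([honest, byzantine])

global_update = ensemble(updates, [fedavg, coordinate_median, trimmed_mean])
print(global_update)  # stays near 0: the outliers are voted down
```

In this toy setting the two Byzantine clients pull plain averaging far from the honest updates, while the median-based aggregators, and hence the ensemble, remain close to them, mirroring the intuition behind combining aggregators rather than relying on any single one.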
