Privacy and Robustness in Federated Learning: Attacks and Defenses

As data are increasingly stored in separate silos and society becomes more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models faces efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continues to thrive in this new reality. Existing FL protocol designs have been shown to be vulnerable to adversaries inside or outside the system, compromising data privacy and system robustness. Beyond training powerful global models, it is therefore of paramount importance to design FL systems that offer privacy guarantees and are resistant to different types of adversaries. In this paper, we conduct the first comprehensive survey on this topic. Through a concise introduction to the concept of FL and a unique taxonomy covering: 1) threat models; 2) poisoning attacks and defenses against robustness; and 3) inference attacks and defenses against privacy, we provide an accessible review of this important topic. We highlight the intuitions, key techniques, and fundamental assumptions adopted by the various attacks and defenses. Finally, we discuss promising future research directions towards robust and privacy-preserving federated learning.

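To make the FL workflow referenced in the abstract concrete, the following is a minimal FedAvg-style sketch (illustrative only, not the protocol of any particular system surveyed): the hypothetical helpers local_update and fedavg_round train a linear model on each client's private data and let the server aggregate the resulting weights by data-size-weighted averaging, so raw data never leaves the clients.

    # Minimal FedAvg-style sketch (illustrative): clients train locally,
    # the server averages their weights; raw data is never shared.
    import numpy as np

    def local_update(weights, X, y, lr=0.1, epochs=5):
        """One client's local training: a few epochs of gradient descent
        on a linear model with squared loss, using only its own data."""
        w = weights.copy()
        for _ in range(epochs):
            grad = 2 * X.T @ (X @ w - y) / len(y)
            w -= lr * grad
        return w

    def fedavg_round(global_weights, client_data):
        """One communication round: broadcast the global model, collect
        locally trained models, average them weighted by dataset size."""
        updates, sizes = [], []
        for X, y in client_data:
            updates.append(local_update(global_weights, X, y))
            sizes.append(len(y))
        sizes = np.array(sizes, dtype=float)
        return np.average(updates, axis=0, weights=sizes / sizes.sum())

    # Toy usage: three clients holding different private datasets.
    rng = np.random.default_rng(0)
    true_w = np.array([1.0, -2.0])
    clients = []
    for _ in range(3):
        X = rng.normal(size=(50, 2))
        y = X @ true_w + 0.1 * rng.normal(size=50)
        clients.append((X, y))

    w = np.zeros(2)
    for _ in range(20):
        w = fedavg_round(w, clients)
    print("learned weights:", w)  # approaches [1, -2] without sharing raw data

This server-side averaging step is precisely where the attacks and defenses discussed in the survey operate: poisoned or Byzantine client updates enter the aggregation, and the exchanged gradients or weights are what inference attacks attempt to exploit.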