Communication-Computation Efficient Secure Aggregation for Federated Learning

Federated learning has been spotlighted as a way to train neural networks using data distributed over multiple nodes without the need for the nodes to share data. Unfortunately, it has also been shown that data privacy could not be fully guaranteed as adversaries may be able to extract certain information on local data from the model parameters transmitted during federated learning. A recent solution based on the secure aggregation primitive enabled privacy-preserving federated learning, but at the expense of significant extra communication/computational resources. In this paper, we propose communication-computation efficient secure aggregation which substantially reduces the amount of communication/computational resources relative to the existing secure solution without sacrificing data privacy. The key idea behind the suggested scheme is to design the topology of the secret-sharing nodes as sparse random graphs instead of the complete graph corresponding to the existing solution. We first obtain the necessary and sufficient condition on the graph to guarantee reliable and private federated learning in the information-theoretic sense. We then suggest using the Erdős-Renyi graph in particular and provide theoretical guarantees on the reliability/privacy of the proposed scheme. Through extensive real-world experiments, we demonstrate that our scheme, using only $20 \sim 30\%$ of the resources required in the conventional scheme, maintains virtually the same levels of reliability and data privacy in practical federated learning systems.

[1]  Vitaly Shmatikov,et al.  Membership Inference Attacks Against Machine Learning Models , 2016, 2017 IEEE Symposium on Security and Privacy (SP).

[2]  Rob Jansen,et al.  Safely Measuring Tor , 2016, CCS.

[3]  Vitaly Shmatikov,et al.  Privacy-preserving deep learning , 2015, 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[4]  Ian Goodfellow,et al.  Deep Learning with Differential Privacy , 2016, CCS.

[5]  Claude Castelluccia,et al.  I Have a DREAM! (DiffeRentially privatE smArt Metering) , 2011, Information Hiding.

[6]  Maria-Florina Balcan,et al.  Distributed Learning, Communication Complexity and Privacy , 2012, COLT.

[7]  Li Xiong,et al.  A Comprehensive Comparison of Multiparty Secure Additions with Differential Privacy , 2017, IEEE Transactions on Dependable and Secure Computing.

[8]  Blaise Agüera y Arcas,et al.  Communication-Efficient Learning of Deep Networks from Decentralized Data , 2016, AISTATS.

[9]  Yongqiang Wang,et al.  ADMM Based Privacy-Preserving Decentralized Optimization , 2017, IEEE Transactions on Information Forensics and Security.

[10]  Refik Molva,et al.  PUDA - Privacy and Unforgeability for Data Aggregation , 2015, CANS.

[11]  Tassilo Klein,et al.  Differentially Private Federated Learning: A Client Level Perspective , 2017, ArXiv.

[12]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[13]  Richard Nock,et al.  Advances and Open Problems in Federated Learning , 2019, Found. Trends Mach. Learn..

[14]  Yehuda Lindell,et al.  Secure Computation on the Web: Computing without Simultaneous Interaction , 2011, IACR Cryptol. ePrint Arch..

[15]  Swaroop Ramaswamy,et al.  Federated Learning for Emoji Prediction in a Mobile Keyboard , 2019, ArXiv.

[16]  Refik Molva,et al.  Private and Dynamic Time-Series Data Aggregation with Trust Relaxation , 2014, CANS.

[17]  Tianqing Zhu,et al.  From distributed machine learning to federated learning: In the view of data privacy and security , 2020, Concurr. Comput. Pract. Exp..

[18]  Tianjian Chen,et al.  Federated Machine Learning: Concept and Applications , 2019 .

[19]  Peter B. Walker,et al.  Federated Learning for Healthcare Informatics , 2019, Journal of Healthcare Informatics Research.

[20]  Aaron Roth,et al.  The Algorithmic Foundations of Differential Privacy , 2014, Found. Trends Theor. Comput. Sci..

[21]  Yuval Ishai,et al.  How Many Oblivious Transfers Are Needed for Secure Multiparty Computation? , 2007, CRYPTO.

[22]  Matthias Fitzi,et al.  Towards Optimal and Efficient Perfectly Secure Message Transmission , 2007, TCC.

[23]  George Danezis,et al.  PrivEx: Private Collection of Traffic Statistics for Anonymous Communication Networks , 2014, CCS.

[24]  Vitaly Shmatikov,et al.  Exploiting Unintended Feature Leakage in Collaborative Learning , 2018, 2019 IEEE Symposium on Security and Privacy (SP).

[25]  Jaekyun Moon,et al.  Election Coding for Distributed Learning: Protecting SignSGD against Byzantine Attacks , 2019, NeurIPS.

[26]  Dimitris S. Papailiopoulos,et al.  Approximate Gradient Coding via Sparse Random Graphs , 2017, ArXiv.

[27]  Wei Shi,et al.  Federated learning of predictive models from federated Electronic Health Records , 2018, Int. J. Medical Informatics.

[28]  H. Vincent Poor,et al.  Federated Learning With Differential Privacy: Algorithms and Performance Analysis , 2019, IEEE Transactions on Information Forensics and Security.

[29]  Ivan Damgård,et al.  Multiparty Computation from Somewhat Homomorphic Encryption , 2012, IACR Cryptol. ePrint Arch..

[30]  A. Salman Avestimehr,et al.  Turbo-Aggregate: Breaking the Quadratic Aggregation Barrier in Secure Federated Learning , 2020, IEEE Journal on Selected Areas in Information Theory.

[31]  Somesh Jha,et al.  Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures , 2015, CCS.

[32]  Rafael Wisniewski,et al.  Privacy Preservation in Distributed Optimization via Dual Decomposition and ADMM , 2019, 2019 IEEE 58th Conference on Decision and Control (CDC).

[33]  Yehuda Lindell,et al.  Efficient Constant Round Multi-Party Computation Combining BMR and SPDZ , 2015, IACR Cryptol. ePrint Arch..

[34]  Shiho Moriai,et al.  Privacy-Preserving Deep Learning via Additively Homomorphic Encryption , 2018, IEEE Transactions on Information Forensics and Security.

[35]  Wenqi Wei,et al.  LDP-Fed: federated learning with local differential privacy , 2020, EdgeSys@EuroSys.

[36]  Tancrède Lepoint,et al.  Secure Single-Server Aggregation with (Poly)Logarithmic Overhead , 2020, IACR Cryptol. ePrint Arch..

[37]  Elaine Shi,et al.  Privacy-Preserving Aggregation of Time-Series Data , 2011, NDSS.

[38]  Avi Wigderson,et al.  Completeness theorems for non-cryptographic fault-tolerant distributed computation , 1988, STOC '88.

[39]  Adi Shamir,et al.  How to share a secret , 1979, CACM.

[40]  Hubert Eichner,et al.  APPLIED FEDERATED LEARNING: IMPROVING GOOGLE KEYBOARD QUERY SUGGESTIONS , 2018, ArXiv.

[41]  Sarvar Patel,et al.  Practical Secure Aggregation for Privacy-Preserving Machine Learning , 2017, IACR Cryptol. ePrint Arch..