Federated Machine Learning

Today’s artificial intelligence still faces two major challenges. One is that, in most industries, data exists in the form of isolated islands. The other is the growing demand for stronger data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated-learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated-learning framework that includes horizontal federated learning, vertical federated learning, and federated transfer learning. We provide definitions, architectures, and applications for the federated-learning framework, and survey existing work on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective way to share knowledge without compromising user privacy.
