Distributed Learning without Distress: Privacy-Preserving Empirical Risk Minimization

Distributed learning allows a group of independent data owners to collaboratively learn a model over their data sets without exposing their private data. We present a distributed learning approach that combines differential privacy with secure multi-party computation. We explore two popular methods of differential privacy, output perturbation and gradient perturbation, and advance the state of the art for both methods in the distributed learning setting. In our output perturbation method, the parties combine local models within a secure computation and then add the required differential privacy noise before revealing the model. In our gradient perturbation method, the data owners collaboratively train a global model via an iterative learning algorithm. At each iteration, the parties aggregate their local gradients within a secure computation, adding sufficient noise to ensure privacy before the gradient updates are revealed. For both methods, we show that the noise can be reduced in the multi-party setting by adding the noise inside the secure computation after aggregation, asymptotically improving upon the best previous results. Experiments on real-world data sets demonstrate that our methods provide substantial utility gains for typical privacy requirements.
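The two methods described above differ mainly in what is aggregated (final local models versus per-iteration gradients); in both, a single noise draw is added after aggregation rather than one draw per party beforehand. The following is a minimal sketch of that ordering, not the paper's implementation. It makes several assumptions: the secure multi-party computation is simulated in the clear, aggregation is a plain average, Gaussian noise is a stand-in for whatever distribution the privacy analysis requires, and `noise_scale`, `learning_rate`, and all function names are hypothetical placeholders rather than calibrated values.

```python
import random


def average(vectors):
    """Coordinate-wise average of equal-length vectors (one vector per party)."""
    m = len(vectors)
    dim = len(vectors[0])
    return [sum(v[j] for v in vectors) / m for j in range(dim)]


def aggregate_then_perturb(local_models, noise_scale, rng):
    """Output perturbation sketch: average local models, then add noise once.

    In a real deployment both steps would run inside a secure computation,
    so that only the noisy average is revealed. Here they run in the clear.
    noise_scale is a placeholder and is NOT calibrated to any privacy budget.
    """
    avg = average(local_models)
    return [w + rng.gauss(0.0, noise_scale) for w in avg]


def perturb_then_aggregate(local_models, noise_scale, rng):
    """Baseline for comparison: every party adds its own noise before averaging."""
    noisy = [[w + rng.gauss(0.0, noise_scale) for w in model]
             for model in local_models]
    return average(noisy)


def noisy_gradient_step(weights, local_gradients, learning_rate, noise_scale, rng):
    """Gradient perturbation sketch: one iteration of the global update.

    The parties' gradients are averaged, noise is added to the average,
    and only the resulting noisy gradient is used to update the weights.
    """
    avg_grad = average(local_gradients)
    noisy_grad = [g + rng.gauss(0.0, noise_scale) for g in avg_grad]
    return [w - learning_rate * g for w, g in zip(weights, noisy_grad)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Three hypothetical parties, each holding a 2-dimensional local model.
    models = [[0.9, -0.2], [1.1, 0.0], [1.0, 0.2]]
    print(aggregate_then_perturb(models, noise_scale=0.01, rng=rng))
    print(perturb_then_aggregate(models, noise_scale=0.01, rng=rng))
```

The sketch only illustrates where the noise is injected. How small the noise can be made in each case depends on the sensitivity analysis, which is not reproduced here.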
