Distributed Privacy-Preserving Prediction

In privacy-preserving machine learning, individual parties are reluctant to share their sensitive training data due to privacy concerns. Even the trained model parameters or predictions can leak sensitive information about that data. To address these problems, we present a generally applicable Distributed Privacy-Preserving Prediction (DPPP) framework in which, instead of sharing sensitive data or model parameters, an untrusted aggregator combines only multiple models' predictions under a provable privacy guarantee. Our framework integrates two main techniques to guarantee individual privacy. First, we improve the previous analysis of the Binomial mechanism to achieve distributed differential privacy. Second, we use homomorphic encryption to ensure that the aggregator learns nothing but the noisy aggregated prediction. We empirically evaluate the effectiveness of our framework on various datasets and compare it with other baselines. The experimental results show that our framework achieves performance comparable to non-private frameworks and delivers better results than the locally differentially private framework and the standalone framework.
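To make the aggregation step concrete, below is a minimal sketch, not the authors' implementation, of how each party could perturb its local model's prediction scores with centred Binomial noise before an aggregator sums them. The homomorphic encryption layer that hides each individual noisy share from the aggregator is omitted, and the parameters (number of parties, Binomial trial count, fixed-point scale) and helper names are illustrative assumptions.

```python
# Hypothetical sketch of distributed Binomial-noise aggregation of predictions.
# Assumptions: 10 parties, 3 classes, Binomial(64, 1/2) noise, fixed-point scale 100.
import numpy as np

rng = np.random.default_rng(0)
num_parties, num_classes = 10, 3
n_trials, scale = 64, 100


def local_share(scores):
    """Quantise one party's class-score vector and add centred Binomial noise."""
    quantised = np.round(scores * scale).astype(np.int64)
    noise = rng.binomial(n_trials, 0.5, size=scores.shape) - n_trials // 2
    # In the full protocol this noisy share would be homomorphically encrypted
    # so the aggregator never sees it in the clear.
    return quantised + noise


# Each party holds its own model; here per-party class scores are faked.
party_scores = rng.dirichlet(np.ones(num_classes), size=num_parties)
shares = [local_share(s) for s in party_scores]

# The aggregator only learns the sum of the noisy shares, i.e. the noisy
# aggregated prediction, and outputs the winning class.
noisy_aggregate = np.sum(shares, axis=0)
print("noisy aggregated scores:", noisy_aggregate)
print("predicted class:", int(np.argmax(noisy_aggregate)))
```

Because the Binomial noise from the parties adds up, the aggregate perturbation can be calibrated so that the combined noise satisfies the desired differential-privacy level while each party contributes only a small share of it.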
