Near-Zero-Cost Differentially Private Deep Learning with Teacher Ensembles

Ensuring the privacy of sensitive data used to train modern machine learning models is of paramount importance in many areas of practice. One approach to studying these concerns is through the lens of differential privacy. In this framework, privacy guarantees are generally obtained by perturbing models so that the specifics of the data used to train them are made ambiguous. A particular instance of this approach is the “teacher-student” model, wherein the teacher, who owns the sensitive data, provides the student with useful but noisy information, allowing the student model to perform well on a given task without access to particular features of the sensitive data. Because stronger privacy guarantees generally require heavier noising on the part of the teacher, deploying existing frameworks fundamentally involves a trade-off between utility and the strength of the privacy guarantee. One of the most important techniques in previous work uses an ensemble of teacher models that return information to the student through a noisy voting procedure. In this work, we propose a novel voting mechanism, which we call an Immutable Noisy ArgMax, that, under certain conditions, can tolerate very large random noising by the teacher without affecting the useful information transferred to the student. Our mechanism improves over state-of-the-art methods on all measures and scales to larger tasks with both higher utility and stronger privacy (ε ≈ 0).
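To make the voting procedure concrete, the sketch below contrasts a standard PATE-style noisy ArgMax over teacher vote counts with an “immutable” variant in the spirit described above, in which the winning vote count is boosted by a large constant before noise is added, so that the returned label matches the clean plurality vote even under very heavy noise. The function names, the boosting constant `c`, and the choice of Laplace noise are illustrative assumptions for this sketch, not the paper's reference implementation.

```python
import numpy as np

def noisy_argmax(votes, noise_scale, rng):
    """Standard noisy ArgMax: add Laplace noise to per-class teacher
    vote counts, then return the class with the highest noisy count."""
    noisy = votes + rng.laplace(loc=0.0, scale=noise_scale, size=votes.shape)
    return int(np.argmax(noisy))

def immutable_noisy_argmax(votes, c, noise_scale, rng):
    """Sketch of an 'immutable' variant (assumption): a large constant c is
    added to the current top vote count before noising, so that for c large
    relative to the noise scale the returned label matches the clean argmax
    with overwhelming probability, even under very large noise."""
    boosted = votes.astype(float).copy()
    boosted[np.argmax(votes)] += c  # assumed mechanism: boost the leading class
    noisy = boosted + rng.laplace(loc=0.0, scale=noise_scale, size=votes.shape)
    return int(np.argmax(noisy))

# Toy usage: 10 teachers vote over 3 classes; class 0 wins the clean vote.
rng = np.random.default_rng(0)
votes = np.array([7, 2, 1])
print(noisy_argmax(votes, noise_scale=20.0, rng=rng))                # may flip under heavy noise
print(immutable_noisy_argmax(votes, c=1e4, noise_scale=20.0, rng=rng))  # stays class 0
```

Under these assumptions, the larger the boosting constant relative to the noise scale, the less the added noise can change the transferred label, which is the intuition behind obtaining strong privacy without degrading the signal sent to the student.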
