A Survey on Differentially Private Machine Learning

Recent years have witnessed remarkable successes of machine learning in a variety of applications. However, machine learning models carry the risk of leaking private information contained in their training data, a problem that has attracted increasing research attention. As one of the mainstream privacy-preserving techniques, differential privacy offers a principled way to prevent the leakage of individual-level information in the training data while preserving the data's utility for model building. This work provides a comprehensive survey of existing work that combines differential privacy with machine learning, so-called differentially private machine learning, and categorizes it into two broad classes according to the underlying privacy mechanism: (i) the Laplace/Gaussian/exponential mechanisms, in which a calibrated amount of noise is added to the non-private model, and (ii) output and objective perturbation, in which the model's output or its objective function is perturbed by random noise. In particular, the survey covers techniques for differentially private deep learning to address recent concerns about the privacy of big-data contributors. In addition, open research challenges concerning model utility, privacy level, and applications are discussed, and several potential future research directions for differentially private machine learning are pointed out.
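To make the "calibrated amount of noise" concrete, the following minimal Python sketch shows the standard Laplace mechanism applied to a bounded-mean query. The function name laplace_mechanism and the example data are illustrative assumptions for this survey's exposition, not an implementation taken from any surveyed work.

    import numpy as np

    def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
        # Release a scalar query answer with Laplace noise of scale
        # b = sensitivity / epsilon, the standard calibration for
        # epsilon-differential privacy.
        rng = np.random.default_rng() if rng is None else rng
        return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

    # Example (illustrative): privately release the mean of an attribute
    # bounded in [0, 1]; for n records the mean has L1 sensitivity 1/n.
    data = np.random.default_rng(seed=0).uniform(0.0, 1.0, size=1000)
    private_mean = laplace_mechanism(data.mean(), sensitivity=1.0 / len(data), epsilon=0.5)
    print(f"true mean = {data.mean():.4f}, private mean = {private_mean:.4f}")

The same calibration idea underlies output and objective perturbation: the noise scale grows with the query's sensitivity and shrinks with the privacy budget epsilon, so stronger privacy (smaller epsilon) means noisier releases and, typically, lower model utility.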
