Defending Model Inversion and Membership Inference Attacks via Prediction Purification

Neural networks are susceptible to data inference attacks such as the model inversion attack and the membership inference attack, in which an attacker reconstructs a data sample or infers its membership in the training set from the confidence scores predicted by the target classifier. In this paper, we propose a unified defense against data inference attacks, termed the purification framework. It purifies the confidence score vectors predicted by the target classifier by reducing their dispersion. The purifier can be further specialized to defend against a particular attack via adversarial learning. We evaluate our approach on benchmark datasets and classifiers, and we show that when the purifier is dedicated to one attack, it naturally defends against the other, which empirically demonstrates the connection between the two attacks. The purifier defends both attacks effectively: for example, it reduces membership inference accuracy by up to 15% and increases the model inversion error by a factor of up to 4, while incurring less than a 0.4% drop in classification accuracy and less than 5.5% distortion to the confidence scores.
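To make the idea concrete, the sketch below shows one plausible way to realize a prediction purifier: an autoencoder that takes the target classifier's confidence vector and re-emits a purified confidence vector whose dispersion is constrained by the bottleneck. This is a minimal illustration under that assumption; the class and parameter names (Purifier, num_classes, hidden_dim) are illustrative and not taken from the paper, and the adversarial-learning specialization described in the abstract is omitted.

```python
# Minimal sketch of a confidence-score purifier (assumed autoencoder design).
# Names and hyperparameters are illustrative, not the paper's exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Purifier(nn.Module):
    """Autoencoder mapping a confidence vector to a purified confidence vector."""

    def __init__(self, num_classes: int, hidden_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(num_classes, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim // 2), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(hidden_dim // 2, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, confidences: torch.Tensor) -> torch.Tensor:
        logits = self.decoder(self.encoder(confidences))
        return F.softmax(logits, dim=-1)  # purified confidence vector


def purification_loss(purified: torch.Tensor, original: torch.Tensor) -> torch.Tensor:
    # Keep purified scores close to the originals (preserving utility);
    # the narrow bottleneck is what reduces their dispersion.
    return F.mse_loss(purified, original)


# Usage at inference time (classifier is the trained target model, assumed given):
# purifier = Purifier(num_classes=10)
# with torch.no_grad():
#     scores = F.softmax(classifier(x), dim=-1)
#     released_scores = purifier(scores)  # what the prediction API returns
```

One could further train such a purifier jointly against an inference adversary (adversarial learning) to specialize it for membership inference or model inversion, as the abstract describes.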
