Label-Only Membership Inference Attacks

Membership inference attacks are one of the simplest forms of privacy leakage for machine learning models: given a data point and a model, determine whether the point was used to train the model. Existing membership inference attacks exploit models' abnormal confidence when queried on their training data. These attacks do not apply if the adversary has access only to the model's predicted labels, without a confidence measure. In this paper, we introduce label-only membership inference attacks. Instead of relying on confidence scores, our attacks evaluate the robustness of a model's predicted labels under perturbations to obtain a fine-grained membership signal. These perturbations include common data augmentations and adversarial examples. We empirically show that our label-only membership inference attacks perform on par with prior attacks that require access to model confidences. We further demonstrate that label-only attacks break multiple defenses against membership inference attacks that (implicitly or explicitly) rely on a phenomenon we call confidence masking. These defenses modify a model's confidence scores in order to thwart attacks, but leave the model's predicted labels unchanged. Our label-only attacks demonstrate that confidence masking is not a viable defense strategy against membership inference. Finally, we investigate worst-case label-only attacks that infer membership for a small number of outlier data points. We show that label-only attacks also match confidence-based attacks in this setting. We find that training models with differential privacy and (strong) L2 regularization are the only known defense strategies that successfully prevent all attacks. This remains true even when the differential privacy budget is too high to offer meaningful provable guarantees.
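As a concrete illustration of the augmentation-based variant, the sketch below scores a candidate point by how robust the model's hard-label prediction is to small perturbations: training points tend to remain correctly classified under augmentation, while non-members do not. This is a minimal sketch, not the paper's exact procedure; the `model_predict` interface, the pixel-shift augmentation set, and the fixed `threshold` are illustrative assumptions (in practice, the threshold would be calibrated, e.g., on shadow models).

```python
import numpy as np

def augmentations(x, max_shift=3):
    """Illustrative perturbation set: small pixel shifts of an image
    array along its height and width axes."""
    perturbed = []
    for shift in range(1, max_shift + 1):
        for axis in (0, 1):
            perturbed.append(np.roll(x, shift, axis=axis))
            perturbed.append(np.roll(x, -shift, axis=axis))
    return perturbed

def label_only_score(model_predict, x, y_true):
    """Membership signal: fraction of (perturbed) queries on which the
    model's predicted label still matches the true label."""
    queries = [x] + augmentations(x)
    correct = sum(int(model_predict(q) == y_true) for q in queries)
    return correct / len(queries)

def infer_membership(model_predict, x, y_true, threshold=0.9):
    """Decide 'member' when the predicted label is highly robust to
    perturbation; `threshold` stands in for a calibrated decision rule."""
    return label_only_score(model_predict, x, y_true) >= threshold
```

The adversarial-example variant works analogously: it uses a label-only (decision-based) attack to estimate how large a perturbation is needed to flip the predicted label, treating a larger distance to the decision boundary as evidence of membership.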
