Membership Inference Attacks From First Principles

A membership inference attack allows an adversary to query a trained machine learning model to predict whether or not a particular example was contained in the model's training dataset. These attacks are currently evaluated using average-case "accuracy" metrics that fail to characterize whether the attack can confidently identify any members of the training set. We argue that attacks should instead be evaluated by computing their true-positive rate at low (e.g., ≤ 0.1%) false-positive rates, and find that most prior attacks perform poorly when evaluated in this way. To address this, we develop a Likelihood Ratio Attack (LiRA) that carefully combines multiple ideas from the literature. Our attack is 10× more powerful at low false-positive rates, and also strictly dominates prior attacks on existing metrics.
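
To make the two ideas above concrete, below is a minimal Python sketch (not the authors' released code) of (i) the per-example likelihood-ratio test that LiRA performs and (ii) the true-positive rate at a fixed low false-positive rate. The function names, the Gaussian fit to logit-scaled shadow-model confidences, and the numerical-stability constants are illustrative assumptions consistent with the abstract's description.

```python
import numpy as np
from scipy.stats import norm

def logit_scale(p, eps=1e-12):
    """Numerically stable logit of the model's confidence on the true label."""
    p = np.clip(p, eps, 1 - eps)
    return np.log(p) - np.log(1 - p)

def lira_score(target_conf, in_confs, out_confs):
    """Per-example likelihood-ratio score; higher means "more likely a member".

    target_conf: logit-scaled confidence of the target model on (x, y).
    in_confs / out_confs: arrays of logit-scaled confidences from shadow
    models trained with / without (x, y); each population is modeled as a
    Gaussian, and the two hypotheses are compared with a log-likelihood ratio.
    """
    mu_in, sd_in = in_confs.mean(), in_confs.std() + 1e-8
    mu_out, sd_out = out_confs.mean(), out_confs.std() + 1e-8
    return (norm.logpdf(target_conf, mu_in, sd_in)
            - norm.logpdf(target_conf, mu_out, sd_out))

def tpr_at_fpr(scores, is_member, max_fpr=1e-3):
    """True-positive rate at the largest threshold with FPR <= max_fpr."""
    scores = np.asarray(scores)
    is_member = np.asarray(is_member, dtype=bool)
    non_members = np.sort(scores[~is_member])[::-1]  # descending
    # Allow at most floor(max_fpr * #non-members) false positives.
    k = max(int(max_fpr * len(non_members)), 1)
    threshold = non_members[k - 1]
    return float((scores[is_member] > threshold).mean())
```

Given scores for a pool of examples with known membership, `tpr_at_fpr(scores, is_member, max_fpr=1e-3)` reports attack performance at the ≤ 0.1% false-positive operating point argued for above.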
