Label-Leaks: Membership Inference Attack with Label

Machine learning (ML) has made tremendous progress during the past decade, and ML models have been deployed in many real-world applications. However, recent research has shown that ML models are vulnerable to attacks against their underlying training data. One major attack in this field is membership inference, whose goal is to determine whether a data sample is part of the training set of a target machine learning model. So far, most membership inference attacks against ML classifiers leverage the posteriors returned by the target model as their input. However, empirical results show that these attacks can be easily mitigated if the target model only returns the predicted label instead of the posteriors. In this paper, we perform a systematic investigation of membership inference when the target model only provides the predicted label, which we name the label-only membership inference attack. We focus on two adversarial settings and propose a different attack for each, namely the transfer-based attack and the perturbation-based attack. The transfer-based attack follows the intuition that if a locally established shadow model is sufficiently similar to the target model, the adversary can leverage the shadow model's information to predict a target sample's membership. The perturbation-based attack relies on adversarial perturbation techniques to push the target sample into a different class and uses the magnitude of the perturbation to judge whether the sample is a member, based on the intuition that a member sample is harder to perturb to a different class than a non-member sample. Extensive experiments over six different datasets demonstrate that both of our attacks achieve strong performance, which further demonstrates the severity of the membership privacy risk of machine learning models.
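To make the perturbation-based intuition concrete, the following is a minimal Python sketch of how a label-only adversary might score a candidate sample. It assumes only a black-box predict_label oracle that returns the predicted class; the growing Gaussian noise is a simplistic stand-in for the decision-based adversarial perturbation techniques the attack actually builds on, and the names perturbation_score, is_member, and threshold are illustrative rather than taken from the paper.

```python
import numpy as np

def perturbation_score(predict_label, x, step=0.01, max_steps=1000, rng=None):
    """Estimate how much noise is needed to flip the predicted label of x.

    predict_label: black-box oracle returning only the predicted class.
    Returns the L2 norm of the smallest label-flipping perturbation found;
    by the attack's intuition, a larger value suggests x is a training member.
    """
    rng = np.random.default_rng() if rng is None else rng
    original = predict_label(x)
    for i in range(1, max_steps + 1):
        # Gradually grow an isotropic Gaussian perturbation until the label flips.
        noise = rng.normal(scale=i * step, size=x.shape)
        if predict_label(x + noise) != original:
            return np.linalg.norm(noise)
    return np.inf  # label never flipped within the budget

def is_member(score, threshold):
    # Members are expected to sit farther from the decision boundary,
    # so a larger perturbation is needed to change their predicted label.
    return score >= threshold
```

In practice, the decision threshold would be calibrated on a shadow model's data, for which membership is known to the adversary.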
