Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference

Membership inference (MI) attacks exploit the fact that machine learning algorithms sometimes leak information about their training data through the learned model. In this work, we study membership inference in the white-box setting in order to exploit the internals of a model, which have not been effectively utilized by previous work. Leveraging new insights about how overfitting occurs in deep neural networks, we show how a model's idiosyncratic use of features can provide evidence of membership to white-box attackers, even when the model's black-box behavior appears to generalize well, and we demonstrate that this attack outperforms prior black-box methods. Taking the position that an effective attack should be able to produce confident positive inferences, we find that previous attacks often fail to provide a meaningful basis for confidently inferring membership, whereas our attack can be effectively calibrated for high precision. Finally, we examine popular defenses against MI attacks, finding that (1) smaller generalization error is not sufficient to prevent attacks on real models, and (2) while differential privacy with small $\epsilon$ reduces the attack's effectiveness, it often does so at a significant cost to the model's accuracy; for the larger values of $\epsilon$ that are sometimes used in practice (e.g., $\epsilon=16$), the attack achieves nearly the same accuracy as on the unprotected model.
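The sketch below is not the paper's white-box attack; it only illustrates, under stated assumptions, the generic idea of calibrating a score-based membership inference decision for high precision. It assumes a per-example membership score is already available (for instance, negative per-example loss, or a statistic derived from internal activations in a white-box setting), and all names in it (`calibrate_threshold`, `member_scores`, `nonmember_scores`, `target_scores`) are hypothetical.

```python
# Minimal, hypothetical sketch: choose a decision threshold for a
# score-based membership inference attack so that positive ("member")
# inferences meet a target precision. Not the attack from the paper.
import numpy as np

def calibrate_threshold(member_scores, nonmember_scores, target_precision=0.95):
    """Return the smallest score threshold whose precision on held-out
    member/non-member scores meets the target, or None if none does."""
    scores = np.concatenate([member_scores, nonmember_scores])
    labels = np.concatenate([np.ones_like(member_scores),
                             np.zeros_like(nonmember_scores)])
    # Try each observed score as a candidate threshold, lowest first,
    # so the returned threshold is the most permissive one that still
    # satisfies the precision target.
    for t in np.sort(scores):
        pred = scores >= t  # predict "member" when the score clears t
        if pred.sum() == 0:
            continue
        precision = labels[pred].mean()
        if precision >= target_precision:
            return t
    return None

# Example usage with synthetic scores standing in for a real attack's output.
rng = np.random.default_rng(0)
member_scores = rng.normal(1.0, 1.0, size=1000)     # members tend to score higher
nonmember_scores = rng.normal(0.0, 1.0, size=1000)
t = calibrate_threshold(member_scores, nonmember_scores, target_precision=0.9)
if t is not None:
    target_scores = rng.normal(0.5, 1.0, size=10)
    print("inferred members:", np.where(target_scores >= t)[0])
```

Selecting the lowest threshold that meets the precision target is one reasonable design choice: it yields as many positive inferences as possible while keeping each one confident, which matches the abstract's emphasis on calibrating for high precision rather than maximizing overall accuracy.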
