LOGAN: Evaluating Privacy Leakage of Generative Models Using Generative Adversarial Networks

Generative models are increasingly used to artificially generate various kinds of data, including high-quality images and videos. These models estimate the underlying distribution of a dataset and randomly generate realistic samples according to that estimated distribution. However, the data used to train these models is often sensitive, prompting the need to evaluate the information leaked by releasing synthetic samples from generative models---specifically, whether an adversary can infer information about the data used to train them. In this paper, we present the first membership inference attack on generative models. To mount the attack, we train a Generative Adversarial Network (GAN), which combines a discriminative and a generative model, to detect overfitting and recognize inputs that were part of the training dataset, relying on the discriminator's capacity to learn statistical differences between distributions. We present attacks based on both white-box and black-box access to the target model, and show how to improve the latter using limited auxiliary knowledge of dataset samples. We test our attacks on several state-of-the-art models, such as Deep Convolutional GAN (DCGAN), Boundary Equilibrium GAN (BEGAN), and the combination of DCGAN with a Variational Autoencoder (DCGAN+VAE), using datasets consisting of complex representations of faces (LFW), objects (CIFAR-10), and medical images (Diabetic Retinopathy). The white-box attacks are 100% successful at inferring which samples were used to train the target model, and the black-box ones succeed with 80% accuracy. Finally, we discuss the sensitivity of our attacks to different training parameters and their robustness against mitigation strategies, finding that successful defenses often come at the cost of significantly worse performance of the generative models in terms of training stability and/or sample quality.
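The white-box attack described above reduces to scoring candidate records with the target GAN's own discriminator and flagging the highest-scoring ones as training members. The following is a minimal sketch of that ranking step, assuming a hypothetical `discriminator_score` callable that stands in for white-box access to the target discriminator; the names and the known member count `n_members` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def white_box_membership_attack(candidates, discriminator_score, n_members):
    """Rank candidate records by the target discriminator's confidence and
    predict that the top-n scoring records were training members.

    candidates          : iterable of candidate records (e.g. images)
    discriminator_score : callable mapping one record to the discriminator's
                          scalar confidence that it is "real" (assumed
                          white-box access to the target model)
    n_members           : assumed number of training members among candidates
    """
    scores = np.array([discriminator_score(x) for x in candidates])
    # An overfitted discriminator assigns higher confidence to records it saw
    # during training, so the top-scoring candidates are predicted as members.
    ranked = np.argsort(scores)[::-1]
    return ranked[:n_members], scores

# Hypothetical usage, with target_disc standing in for the victim's
# trained discriminator network:
# members, scores = white_box_membership_attack(candidates, target_disc, 100)
```

In the black-box setting, where the target discriminator is not available, the paper instead trains a local GAN on samples drawn from the target generator (optionally augmented with auxiliary known samples) and applies the same ranking using the local discriminator's scores.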
