On the Privacy Properties of GAN-generated Samples

The privacy implications of generative adversarial networks (GANs) are a topic of great interest, leading to several recent algorithms for training GANs with privacy guarantees. By drawing connections to the generalization properties of GANs, we prove that under some assumptions, GAN-generated samples inherently satisfy some (weak) privacy guarantees. First, we show that if a GAN is trained on m samples and used to generate n samples, the generated samples are (ε, δ)-differentially private for (ε, δ) pairs where δ scales as O(n/m). We show that under some special conditions, this upper bound is tight. Next, we study the robustness of GAN-generated samples to membership inference attacks. We model membership inference as a hypothesis test in which the adversary must determine whether a given sample was drawn from the training dataset or from the underlying data distribution. We show that this adversary can achieve an area under the ROC curve that scales no better than O(m^{-1/4}).
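To make the hypothesis-test view of membership inference concrete, the following is a minimal Python sketch and not the construction analyzed in the paper. It assumes a one-dimensional toy setting in which the "generator" is a deliberately memorizing stand-in (it resamples training points and adds small Gaussian noise) rather than a trained GAN. The adversary scores a candidate point by its distance to the nearest generated sample. The names `auc`, `nearest_distance`, and `simulate_attack_auc`, and the parameters `noise` and `seed`, are hypothetical and chosen only for illustration.

```python
import random


def auc(member_scores, nonmember_scores):
    """Area under the ROC curve, computed as the probability that a random
    member receives a higher score than a random non-member (ties count 1/2)."""
    wins = 0.0
    for s in member_scores:
        for t in nonmember_scores:
            if s > t:
                wins += 1.0
            elif s == t:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


def nearest_distance(x, generated):
    """Distance from x to the closest generated sample (1-D toy setting)."""
    return min(abs(x - g) for g in generated)


def simulate_attack_auc(m, n, noise=0.05, seed=0):
    """Toy experiment: m training points, n generated points.

    Assumption: the 'generator' below is NOT a GAN. It resamples training
    points with small noise so that membership leaks in an obvious way.
    """
    rng = random.Random(seed)
    train = [rng.gauss(0.0, 1.0) for _ in range(m)]   # members
    fresh = [rng.gauss(0.0, 1.0) for _ in range(m)]   # non-members, same distribution
    generated = [rng.choice(train) + rng.gauss(0.0, noise) for _ in range(n)]

    # Higher score means "more likely a training member".
    member_scores = [-nearest_distance(x, generated) for x in train]
    nonmember_scores = [-nearest_distance(x, fresh_gen_ref)
                        for x, fresh_gen_ref in ((x, generated) for x in fresh)]
    return auc(member_scores, nonmember_scores)


if __name__ == "__main__":
    for m in (50, 200, 800):
        print(m, simulate_attack_auc(m=m, n=m))
```

An AUC near 0.5 means the adversary does no better than guessing, while values near 1 indicate strong leakage. Because the toy generator memorizes by design, its numbers say nothing about the O(m^{-1/4}) rate stated above; the sketch only shows how the test and its ROC area are set up.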
