PPGAN: Privacy-Preserving Generative Adversarial Network

Generative Adversarial Networks (GANs) and their variants are among the most effective data generation models, providing researchers with large amounts of high-quality generated data. They offer a promising direction for research where data availability is limited. However, when a GAN learns a semantically rich data distribution from a dataset, the density of the generated distribution tends to concentrate on the training data. Because the parameters of a deep neural network encode the distribution of the training samples, the network can easily memorize those samples. When a GAN is applied to private or sensitive data, such as patient medical records, private information may therefore be leaked. To address this issue, we propose a Privacy-Preserving Generative Adversarial Network (PPGAN) model, in which we achieve differential privacy in GANs by adding carefully designed noise to the gradients during model training. In addition, we introduce the Moments Accountant strategy into the PPGAN training process to improve the stability and compatibility of the model by controlling the privacy loss. We also give a mathematical proof that the discriminator is differentially private. Through extensive case studies on benchmark datasets, we demonstrate that PPGAN can generate high-quality synthetic data while retaining the required data utility under a reasonable privacy budget.
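The abstract does not spell out the noise mechanism, so the following is only a minimal sketch of the general clip-and-add-noise gradient step used in differentially private training, not the authors' implementation. It assumes per-example gradients are available as rows of an array; the function name `privatize_gradients` and the parameters `clip_norm` and `noise_multiplier` are hypothetical names chosen for illustration.

```python
import numpy as np


def privatize_gradients(per_example_grads, clip_norm, noise_multiplier, rng):
    """Return a noisy average gradient (illustrative sketch, assumed setup).

    per_example_grads: array of shape (batch_size, num_params), one row per example.
    clip_norm: maximum L2 norm allowed for each per-example gradient.
    noise_multiplier: ratio of the Gaussian noise std to clip_norm.
    rng: a numpy.random.Generator used to draw the noise.
    """
    # L2 norm of each per-example gradient, kept as a column for broadcasting.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Scale down only the gradients whose norm exceeds clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # Gaussian noise calibrated to the clipping bound (the per-example sensitivity).
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=clipped.shape[1])
    # Add noise to the summed gradient, then average over the batch.
    return (clipped.sum(axis=0) + noise) / len(per_example_grads)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grads = rng.normal(size=(8, 4))  # stand-in for discriminator gradients
    noisy = privatize_gradients(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
    print(noisy.shape)
```

In a GAN setting this step would typically be applied to the discriminator updates, since the discriminator is the component that sees the real data; a privacy accountant such as the Moments Accountant would then track the cumulative privacy loss across training steps.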
