GANobfuscator: Mitigating Information Leakage Under GAN via Differential Privacy

By learning generative models of semantically rich data distributions from samples, the generative adversarial network (GAN) has recently attracted intensive research interest due to its excellent empirical performance as a generative model. A GAN estimates the underlying distribution of a dataset and can then generate realistic samples from that estimated distribution. However, because of the high model complexity of deep networks, GANs can easily memorize training samples; when they are trained on private or sensitive data, the learned distribution may concentrate around individual records and divulge critical information. Mitigating such information leakage under GANs therefore requires new technical advances. To address this issue, we propose GANobfuscator, a differentially private GAN that achieves differential privacy by injecting carefully designed noise into the gradients during the learning procedure. With GANobfuscator, analysts can generate an unlimited amount of synthetic data for arbitrary analysis tasks without compromising the privacy of the training data. Moreover, we theoretically prove that GANobfuscator provides a strict differential-privacy guarantee. In addition, we develop a gradient-pruning strategy for GANobfuscator to improve the scalability and stability of training. Through extensive experimental evaluation on benchmark datasets, we demonstrate that GANobfuscator produces high-quality generated data and retains desirable utility under practical privacy budgets.
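The mechanism the abstract describes, clipping (pruning) gradients and perturbing them with calibrated noise during training, follows the general DP-SGD pattern. The sketch below illustrates one such differentially private update on a GAN discriminator in PyTorch; it is a minimal illustration of that pattern, not the paper's actual implementation. The function name dp_discriminator_step, the loss_fn interface, and the constants C and SIGMA are hypothetical, and in practice sigma must be calibrated to the target (epsilon, delta) budget via a privacy accountant.

```python
import torch

# Hypothetical hyperparameters (not from the paper): clipping bound C
# (the "pruning" threshold) and Gaussian noise multiplier SIGMA.
C = 1.0
SIGMA = 1.1

def dp_discriminator_step(discriminator, loss_fn, real_batch, fake_batch, optimizer):
    """One DP-SGD-style discriminator update: each per-example gradient is
    clipped to L2 norm at most C, the clipped gradients are summed, Gaussian
    noise of scale SIGMA * C is added, and the result is averaged before the
    optimizer step. Assumes loss_fn(disc, real, fake) returns a scalar loss
    and that every trainable parameter receives a gradient."""
    params = [p for p in discriminator.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    n = real_batch.size(0)
    for i in range(n):  # microbatches of size 1 yield true per-example gradients
        optimizer.zero_grad()
        loss = loss_fn(discriminator, real_batch[i:i + 1], fake_batch[i:i + 1])
        loss.backward()
        # Clip this example's gradient to global L2 norm at most C.
        total_norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in params))
        scale = (C / (total_norm + 1e-12)).clamp(max=1.0)
        for s, p in zip(summed, params):
            s += p.grad * scale
    # Perturb the summed gradient with noise calibrated to the clipping
    # bound, then average over the batch and apply the update.
    with torch.no_grad():
        for s, p in zip(summed, params):
            noise = torch.randn_like(s) * SIGMA * C
            p.grad = (s + noise) / n
    optimizer.step()
```

In designs of this kind, perturbing only the discriminator suffices: the generator never touches real data directly, so by the post-processing property of differential privacy the guarantee carries over to the generator and to every sample it produces.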