Stabilizing Training of Generative Adversarial Networks through Regularization

Deep generative models based on Generative Adversarial Networks (GANs) have demonstrated impressive sample quality, but they require a careful choice of architecture, parameter initialization, and hyper-parameters in order to train reliably. This fragility is in part due to a dimensional mismatch or non-overlapping support between the model distribution and the data distribution, which causes their density ratio and the associated f-divergence to be undefined. We overcome this fundamental limitation and propose a new regularization approach with low computational cost that yields a stable GAN training procedure. We demonstrate the effectiveness of this regularizer across several architectures trained on common benchmark image generation tasks. Our regularization turns GAN models into reliable building blocks for deep learning.
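As a rough illustration of what "regularizing the discriminator at low computational cost" can look like in practice, the sketch below adds a squared gradient-norm penalty on the discriminator to the standard GAN loss. This is only a minimal PyTorch-style example under assumptions of my own: the exact weighting scheme of the paper's regularizer is not specified in this abstract, and the names D, x_real, x_fake, and the coefficient gamma are illustrative placeholders.

    # Minimal sketch (PyTorch), NOT the paper's exact regularizer:
    # penalize the squared L2 norm of grad_x D(x) on real and generated
    # samples, added to the usual non-saturating discriminator loss.
    import torch
    import torch.nn.functional as F

    def gradient_norm_penalty(D, x):
        """Batch-averaged squared L2 norm of grad_x D(x)."""
        x = x.detach().requires_grad_(True)
        out = D(x)
        grads, = torch.autograd.grad(outputs=out.sum(), inputs=x, create_graph=True)
        return grads.flatten(start_dim=1).pow(2).sum(dim=1).mean()

    def discriminator_loss(D, x_real, x_fake, gamma=2.0):
        """Standard GAN discriminator loss plus a gradient-norm regularizer."""
        logits_real = D(x_real)
        logits_fake = D(x_fake.detach())
        loss = (F.binary_cross_entropy_with_logits(logits_real, torch.ones_like(logits_real))
                + F.binary_cross_entropy_with_logits(logits_fake, torch.zeros_like(logits_fake)))
        reg = gradient_norm_penalty(D, x_real) + gradient_norm_penalty(D, x_fake.detach())
        return loss + 0.5 * gamma * reg

The extra cost here is one additional backward pass through the discriminator per batch, which is consistent with the abstract's claim of a low-overhead, drop-in modification to standard GAN training.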
