The Numerics of GANs

In this paper, we analyze the numerics of common algorithms for training Generative Adversarial Networks (GANs). Using the formalism of smooth two-player games we analyze the associated gradient vector field of GAN training objectives. Our findings suggest that the convergence of current algorithms suffers due to two factors: i) presence of eigenvalues of the Jacobian of the gradient vector field with zero real-part, and ii) eigenvalues with big imaginary part. Using these findings, we design a new algorithm that overcomes some of these limitations and has better convergence properties. Experimentally, we demonstrate its superiority on training common GAN architectures and show convergence on GAN architectures that are known to be notoriously hard to train.

[1]  Trevor Darrell,et al.  Adversarial Feature Learning , 2016, ICLR.

[2]  Minh N. Do,et al.  Semantic Image Inpainting with Perceptual and Contextual Losses , 2016, ArXiv.

[3]  Lucas Theis,et al.  Amortised MAP Inference for Image Super-resolution , 2016, ICLR.

[4]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[5]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[6]  Sebastian Nowozin,et al.  f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization , 2016, NIPS.

[7]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[9]  Jonathon Shlens,et al.  Conditional Image Synthesis with Auxiliary Classifier GANs , 2016, ICML.

[10]  Trevor Darrell,et al.  Adversarial Discriminative Domain Adaptation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  David Pfau,et al.  Unrolled Generative Adversarial Networks , 2016, ICLR.

[12]  Aaron C. Courville,et al.  Adversarially Learned Inference , 2016, ICLR.

[13]  Léon Bottou,et al.  Towards Principled Methods for Training Generative Adversarial Networks , 2017, ICLR.

[14]  Razvan Pascanu,et al.  Natural Gradient Revisited , 2013, ICLR.

[15]  Shun-ichi Amari,et al.  Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.

[16]  David Pfau,et al.  Connecting Generative Adversarial Networks and Actor-Critic Methods , 2016, ArXiv.

[17]  J. Lambert Numerical Methods for Ordinary Differential Equations , 1991 .

[18]  Sebastian Nowozin,et al.  Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks , 2017, ICML.

[19]  Dimitri P. Bertsekas,et al.  Nonlinear Programming , 1997 .

[20]  Yingyu Liang,et al.  Generalization and Equilibrium in Generative Adversarial Nets (GANs) , 2017, ICML.

[21]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[22]  Christian Ledig,et al.  Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Martín Abadi,et al.  TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.

[24]  Dimitri P. Bertsekas,et al.  Constrained Optimization and Lagrange Multiplier Methods , 1982 .

[25]  Minh N. Do,et al.  Semantic Image Inpainting with Deep Generative Models , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  S. Shankar Sastry,et al.  Characterization and computation of local Nash equilibria in continuous games , 2013, 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton).