A Closer Look at the Optimization Landscapes of Generative Adversarial Networks

Generative adversarial networks (GANs) have been very successful in generative modeling; however, they remain considerably harder to train than standard deep neural networks. In this paper, we propose new visualization techniques for the optimization landscapes of GANs that enable us to study the game vector field resulting from the concatenation of the gradients of both players. Using these visualization techniques, we bridge part of the gap between theory and practice by showing empirically that GAN training exhibits significant rotations around Locally Stable Stationary Points (LSSPs), similar to the rotations predicted by theory on toy examples. Moreover, we provide empirical evidence that GAN training converges to a stable stationary point that is a saddle point of the generator loss, not a minimum, while still achieving excellent performance.
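For concreteness, the game vector field mentioned above is the joint gradient field of the two players. A minimal sketch of the definition, in illustrative notation (the symbols \theta, \varphi, \mathcal{L}_G, \mathcal{L}_D are ours and may differ from the paper's):

v(\theta, \varphi) = \begin{pmatrix} \nabla_{\theta}\, \mathcal{L}_G(\theta, \varphi) \\ \nabla_{\varphi}\, \mathcal{L}_D(\theta, \varphi) \end{pmatrix}

where \theta and \mathcal{L}_G are the generator's parameters and loss, and \varphi and \mathcal{L}_D are the discriminator's. Rotational dynamics around an LSSP correspond to eigenvalues of the Jacobian \nabla v with nonzero imaginary part at that point, which is the behavior the proposed visualizations are designed to reveal.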
