Interaction Matters: A Note on Non-asymptotic Local Convergence of Generative Adversarial Networks

Motivated by the pursuit of a systematic computational and algorithmic understanding of Generative Adversarial Networks (GANs), we present a simple yet unified non-asymptotic local convergence theory for smooth two-player games, which subsumes several discrete-time gradient-based saddle point dynamics. The analysis reveals the surprising nature of the off-diagonal interaction term as both a blessing and a curse. On the one hand, this interaction term explains the origin of the slow-down effect in the convergence of Simultaneous Gradient Ascent (SGA) to stable Nash equilibria. On the other hand, for unstable equilibria, the interaction term makes it possible to prove exponential convergence for three modified dynamics that have been proposed to stabilize GAN training: Optimistic Mirror Descent (OMD), Consensus Optimization (CO), and the Predictive Method (PM). The analysis uncovers intimate connections among these stabilizing techniques and provides a detailed characterization of the choice of learning rate.
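
To make these dynamics concrete, below is a minimal numerical sketch (an illustration, not code from the paper) contrasting SGA with OMD on the bilinear toy game f(x, y) = xy, whose unique equilibrium sits at the origin. The learning rate eta = 0.1 and the specific update rules are assumptions of this sketch, following the standard discrete-time forms of the two methods: on this example the SGA iterates spiral away from the equilibrium, while the optimistic correction in OMD pulls them back in, illustrating the dual role of the interaction term described above.

```python
import numpy as np

# Toy bilinear game f(x, y) = x * y: x is the minimizing player, y the maximizing one.
# The unique equilibrium is (0, 0); the gradients are df/dx = y and df/dy = x.

def sga_step(x, y, eta):
    """Simultaneous Gradient Ascent: each player steps along its current gradient."""
    return x - eta * y, y + eta * x

def omd_step(x, y, x_prev, y_prev, eta):
    """Optimistic Mirror Descent: step along 2 * (current gradient) - (previous gradient)."""
    x_new = x - eta * (2 * y - y_prev)
    y_new = y + eta * (2 * x - x_prev)
    return x_new, y_new

eta = 0.1
xs, ys = 1.0, 1.0            # SGA iterate
xo, yo = 1.0, 1.0            # OMD iterate
xo_prev, yo_prev = 1.0, 1.0  # previous OMD iterate (needed for the correction term)

for _ in range(200):
    xs, ys = sga_step(xs, ys, eta)
    xo_new, yo_new = omd_step(xo, yo, xo_prev, yo_prev, eta)
    xo_prev, yo_prev = xo, yo
    xo, yo = xo_new, yo_new

print("SGA distance to (0, 0):", np.hypot(xs, ys))  # grows: SGA spirals outward
print("OMD distance to (0, 0):", np.hypot(xo, yo))  # shrinks toward zero
```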
