A Variational Inequality Perspective on Generative Adversarial Networks

Generative adversarial networks (GANs) form a generative modeling approach known for producing appealing samples, but they are notably difficult to train. One common way to tackle this issue has been to propose new formulations of the GAN objective. Yet, surprisingly few studies have looked at optimization methods designed for this adversarial training. In this work, we cast GAN optimization problems in the general variational inequality framework. Tapping into the mathematical programming literature, we counter some common misconceptions about the difficulties of saddle-point optimization and propose to extend techniques designed for variational inequalities to the training of GANs. We apply averaging, extrapolation, and a computationally cheaper variant that we call extrapolation from the past to stochastic gradient descent (SGD) and Adam.
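
To make the three techniques concrete, here is a minimal numpy sketch (not the paper's implementation) of plain simultaneous gradient steps, extrapolation (the extragradient step), extrapolation from the past, and iterate averaging on the toy bilinear saddle-point problem min_x max_y x*y, whose vector field is F(x, y) = (y, -x). Function names and the step size are illustrative choices, not taken from the paper.

```python
import numpy as np

def F(z):
    """Vector field of the bilinear game min_x max_y x*y."""
    x, y = z
    return np.array([y, -x])

def simultaneous_gd(z, eta=0.1, steps=100):
    """Plain simultaneous gradient steps: spirals outward on this problem."""
    for _ in range(steps):
        z = z - eta * F(z)
    return z

def extragradient(z, eta=0.1, steps=100):
    """Extrapolation: look ahead with F(z), then update with the
    gradient evaluated at the extrapolated point (two F calls per step)."""
    for _ in range(steps):
        z_half = z - eta * F(z)    # extrapolation step
        z = z - eta * F(z_half)    # update step
    return z

def extrapolation_from_the_past(z, eta=0.1, steps=100):
    """Cheaper variant: extrapolate by reusing the gradient stored from the
    previous extrapolated point, so each step needs only one F call."""
    g_prev = F(z)
    for _ in range(steps):
        z_half = z - eta * g_prev  # extrapolate with the past gradient
        g_prev = F(z_half)
        z = z - eta * g_prev       # update with the fresh gradient
    return z

def averaged(update, z, eta=0.1, steps=100):
    """Uniform average of the iterates produced by `update`, one step at a time."""
    avg = z.copy()
    for t in range(1, steps + 1):
        z = update(z, eta=eta, steps=1)
        avg += (z - avg) / (t + 1)
    return avg

if __name__ == "__main__":
    z0 = np.array([1.0, 1.0])
    print("simultaneous GD:", simultaneous_gd(z0))  # drifts away from (0, 0)
    print("extragradient:", extragradient(z0))      # converges toward (0, 0)
    print("extrapolation from the past:", extrapolation_from_the_past(z0))
    # the averaged GD iterate ends up much closer to (0, 0) than the last iterate
    print("averaged GD:", averaged(simultaneous_gd, z0))
```

On this toy problem the plain simultaneous updates slowly diverge, while both extrapolation variants converge to the equilibrium (0, 0) and averaging pulls the diverging GD iterates much closer to it, which is the qualitative behavior the abstract alludes to.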
