On Characterizing GAN Convergence Through Proximal Duality Gap

Despite the success of Generative Adversarial Networks (GANs) in modeling data distributions, training them remains challenging. A contributing factor to this difficulty is the non-intuitive nature of GAN loss curves, which forces practitioners to judge training progress by subjectively inspecting the generated output. Recently, motivated by game theory, the duality gap was proposed as a domain-agnostic measure for monitoring GAN training. However, it applies only when the GAN converges to a Nash equilibrium, and GANs need not converge to a Nash equilibrium to model the data distribution well. In this work, we extend the notion of the duality gap to the proximal duality gap, which applies to the general setting of GAN training where Nash equilibria may not exist. We show theoretically that the proximal duality gap can monitor the convergence of GANs to a broader spectrum of equilibria that subsumes Nash equilibria. We also theoretically establish the relationship between the proximal duality gap and the divergence between the real and generated data distributions for different GAN formulations. Our results provide new insights into the nature of GAN convergence. Finally, we experimentally validate the usefulness of the proximal duality gap for monitoring and influencing GAN training.
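For a two-player minimax game min_{\theta_g} max_{\theta_d} V(\theta_g, \theta_d), the standard duality gap at a point (\theta_g, \theta_d) is

    DG(\theta_g, \theta_d) = \max_{\theta_d'} V(\theta_g, \theta_d') - \min_{\theta_g'} V(\theta_g', \theta_d) \ge 0,

and it vanishes exactly when (\theta_g, \theta_d) is a Nash equilibrium. The proximal duality gap measures the gap with respect to a proximally regularized objective rather than V itself; as an illustrative sketch only (the quadratic proximal term with weight \lambda is an assumption here, and the paper's own definition is authoritative), such a regularized game and its gap could take the form

    V_\lambda(\theta_g, \theta_d) = \max_{\tilde{\theta}_d} \big[ V(\theta_g, \tilde{\theta}_d) - \lambda \,\| \tilde{\theta}_d - \theta_d \|^2 \big],
    DG_\lambda(\theta_g, \theta_d) = \max_{\theta_d'} V_\lambda(\theta_g, \theta_d') - \min_{\theta_g'} V_\lambda(\theta_g', \theta_d).

In practice the inner maximization and minimization are intractable, so the gap is typically approximated by running a few gradient steps on copies of the discriminator and generator starting from the current checkpoint. The PyTorch sketch below illustrates this approximation for the standard minimax objective; the helper names, step counts, learning rates, and the toy shapes in the usage comment are illustrative placeholders rather than the paper's experimental setup.

    import copy
    import torch
    import torch.nn as nn

    def game_value(G, D, real, z):
        # Minimax objective V(G, D) = E[log D(x)] + E[log(1 - D(G(z)))],
        # estimated on a single batch; D is assumed to output probabilities.
        eps = 1e-7
        d_real = D(real).clamp(eps, 1 - eps)
        d_fake = D(G(z)).clamp(eps, 1 - eps)
        return torch.log(d_real).mean() + torch.log(1 - d_fake).mean()

    def duality_gap(G, D, real, z, steps=50, lr=1e-3):
        # DG(G, D) ~ max_{D'} V(G, D') - min_{G'} V(G', D), with both inner
        # problems approximated by a few gradient steps from the checkpoint.
        D_worst = copy.deepcopy(D)   # approximate best response of the discriminator
        opt_d = torch.optim.Adam(D_worst.parameters(), lr=lr)
        for _ in range(steps):
            loss = -game_value(G, D_worst, real, z)   # ascend V in D'
            opt_d.zero_grad()
            loss.backward()
            opt_d.step()

        G_worst = copy.deepcopy(G)   # approximate best response of the generator
        opt_g = torch.optim.Adam(G_worst.parameters(), lr=lr)
        for _ in range(steps):
            loss = game_value(G_worst, D, real, z)    # descend V in G'
            opt_g.zero_grad()
            loss.backward()
            opt_g.step()

        with torch.no_grad():
            return game_value(G, D_worst, real, z) - game_value(G_worst, D, real, z)

    # Usage with toy models (shapes are illustrative):
    # G = nn.Sequential(nn.Linear(8, 2))
    # D = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())
    # real, z = torch.randn(128, 2), torch.randn(128, 8)
    # print(duality_gap(G, D, real, z).item())

A non-negative gap that decreases toward zero over training is the signal of interest: it can be tracked per checkpoint without inspecting samples, which is what makes the measure domain-agnostic.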
