Stackelberg GAN: Towards Provable Minimax Equilibrium via Multi-Generator Architectures

We study how new architecture design can alleviate instability in GAN training. The discrepancy between the minimax and maximin objective values can serve as a proxy for the difficulties that alternating gradient descent encounters when optimizing GANs. In this work, we give new results on the benefits of multi-generator architectures for GANs. We show that the minimax gap shrinks to $\epsilon$ as the number of generators increases at rate $\widetilde{O}(1/\epsilon)$, improving on the best-known result of $\widetilde{O}(1/\epsilon^2)$. At the core of our techniques is a novel application of the Shapley-Folkman lemma to the generic minimax problem; in the literature, the technique was previously known to work only when the objective function is restricted to the Lagrangian of a constrained optimization problem. Our proposed Stackelberg GAN performs well experimentally on both synthetic and real-world datasets, improving Fréchet Inception Distance by $14.61\%$ over previous multi-generator GANs on the benchmark datasets.
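To make the notion of a minimax gap concrete, here is a minimal Python sketch on a toy problem. Everything in it is an assumption chosen for illustration and is not taken from the paper: the payoff `phi(g, t) = g*t - t**2/2`, the two generator choices `-1` and `+1`, the grid over the discriminator parameter `t`, and the modelling of a multi-generator objective as the average of `phi` over the generators. The payoff is concave in `t` but the generator's choice set is non-convex, so a single generator leaves a gap between the minimax and maximin values.

```python
from itertools import combinations_with_replacement

# Toy setup (illustrative assumptions, not the paper's experiments):
# generator parameter g is restricted to a non-convex set {-1, +1},
# discriminator parameter t ranges over a grid on [-1, 1].
GENERATOR_CHOICES = (-1.0, 1.0)
THETA_GRID = [k / 1000.0 for k in range(-1000, 1001)]


def phi(g, t):
    """Toy payoff, concave in the discriminator parameter t."""
    return g * t - 0.5 * t * t


def averaged_payoff(gens, t):
    """Assumed multi-generator objective: mean of phi over the generators."""
    return sum(phi(g, t) for g in gens) / len(gens)


def generator_tuples(num_generators):
    """All unordered choices of parameters for num_generators generators."""
    return list(combinations_with_replacement(GENERATOR_CHOICES, num_generators))


def minimax_value(num_generators):
    """min over generator tuples of max over t."""
    return min(
        max(averaged_payoff(gens, t) for t in THETA_GRID)
        for gens in generator_tuples(num_generators)
    )


def maximin_value(num_generators):
    """max over t of min over generator tuples."""
    tuples = generator_tuples(num_generators)
    return max(
        min(averaged_payoff(gens, t) for gens in tuples)
        for t in THETA_GRID
    )


def minimax_gap(num_generators):
    """Non-negative difference between the minimax and maximin values."""
    return minimax_value(num_generators) - maximin_value(num_generators)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, round(minimax_gap(n), 4))
```

In this toy the gap is 1/2 with one generator, vanishes with two (one generator at each choice averages out to zero), and is roughly 1/18 with three. The sketch only illustrates why averaging several generators can close the gap left by a non-convex generator set; it does not reproduce the rate stated in the abstract.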

Pengtao Xie | Eric P. Xing | Ruslan Salakhutdinov | Jiantao Jiao | Susu Xu | Hongyang Zhang
