Convergence and Sample Complexity of SGD in GANs

We provide theoretical convergence guarantees for training Generative Adversarial Networks (GANs) via SGD. We consider learning a target distribution modeled by a one-layer Generator network with a non-linear activation function $\phi(\cdot)$ parametrized by a $d \times d$ weight matrix $\mathbf W_*$, i.e., $f_*(\mathbf x) = \phi(\mathbf W_* \mathbf x)$. Our main result is that training the Generator together with a Discriminator via the Stochastic Gradient Descent-Ascent (SGDA) iteration proposed by Goodfellow et al. yields a Generator distribution that approaches the target distribution of $f_*$. Specifically, we can learn the target distribution within total-variation distance $\epsilon$ using $\tilde O(d^2/\epsilon^2)$ samples, which is (near-)information-theoretically optimal. Our results apply to a broad class of non-linear activation functions $\phi$, including ReLUs; they are enabled by a connection with truncated statistics and an appropriate design of the Discriminator network. Our approach relies on a bilevel optimization framework to show that vanilla SGDA works.
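
To make the training dynamics concrete, below is a minimal PyTorch sketch of the alternating SGDA iteration for a one-layer ReLU Generator. The linear discriminator, dimensions, step sizes, and batch size here are illustrative assumptions for exposition only; the paper's analysis relies on a specifically designed Discriminator rather than this generic choice.

```python
# Minimal SGDA sketch for learning f_*(x) = relu(W_* x) with a one-layer
# generator. The linear discriminator and all hyperparameters are
# illustrative assumptions, not the construction analyzed in the paper.
import torch

torch.manual_seed(0)
d = 8
W_star = torch.randn(d, d)                 # unknown target weight matrix
W = torch.randn(d, d, requires_grad=True)  # generator parameters
v = torch.zeros(d, requires_grad=True)     # toy linear discriminator

gen_opt = torch.optim.SGD([W], lr=1e-3)    # descent player (generator)
dis_opt = torch.optim.SGD([v], lr=1e-3)    # ascent player (discriminator)

def gap():
    """Fresh-minibatch estimate of the discriminator's real-vs-fake gap."""
    real = torch.relu(torch.randn(256, d) @ W_star.T)  # target samples
    fake = torch.relu(torch.randn(256, d) @ W.T)       # generator samples
    return (real @ v).mean() - (fake @ v).mean()

for step in range(10_000):
    # Ascent step: the discriminator widens the gap.
    dis_opt.zero_grad()
    (-gap()).backward()
    dis_opt.step()

    # Descent step: the generator shrinks the gap.
    gen_opt.zero_grad()
    gap().backward()
    gen_opt.step()
```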

[1] Ian J. Goodfellow, et al. Generative Adversarial Nets, 2014, NIPS.

[2] Constantinos Daskalakis, et al. The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization, 2018, NeurIPS.

[3] T. Sanders, et al. Analysis of Boolean Functions, 2012, ArXiv.

[4] Ryan O'Donnell, et al. Analysis of Boolean Functions, 2014, ArXiv.

[5] Sebastian Nowozin, et al. Which Training Methods for GANs do actually Converge?, 2018, ICML.

[6] Mingrui Liu, et al. Non-Convex Min-Max Optimization: Provable Algorithms and Applications in Machine Learning, 2018, ArXiv.

[7] Christos Tzamos, et al. Efficient Statistics, in High Dimensions, from Truncated Samples, 2018, FOCS.

[8] Constantinos Daskalakis, et al. Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization, 2018, ITCS.

[9] Lin Yang, et al. Photographic Text-to-Image Synthesis with a Hierarchically-Nested Adversarial Network, 2018, CVPR.

[10] Georgios Piliouras, et al. Poincaré Recurrence, Cycles and Spurious Equilibria in Gradient-Descent-Ascent for Non-Convex Non-Concave Zero-Sum Games, 2019, NeurIPS.

[11] J. Zico Kolter, et al. Gradient descent GAN optimization is locally stable, 2017, NIPS.

[12] Sridhar Mahadevan, et al. Global Convergence to the Equilibrium of GANs using Variational Inequalities, 2018, ArXiv.

[13] Saeed Ghadimi, et al. Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization, 2013, Mathematical Programming.

[14] Taesung Park, et al. CyCADA: Cycle-Consistent Adversarial Domain Adaptation, 2017, ICML.

[15] Léon Bottou, et al. Towards Principled Methods for Training Generative Adversarial Networks, 2017, ICLR.

[16] A. Carbery, et al. Distributional and $L^q$ norm inequalities for polynomials over convex bodies in $\mathbb{R}^n$, 2001.

[17] Hao Wang, et al. Unsupervised Graph Representation Learning With Variable Heat Kernel, 2020, IEEE Access.

[18] Nicholas J. A. Harvey, et al. Simple and optimal high-probability bounds for strongly-convex stochastic gradient descent, 2019, ArXiv.

[19] Kamalika Chaudhuri, et al. Approximation and Convergence Properties of Generative Adversarial Learning, 2017, NIPS.

[20] Nicholas J. A. Harvey, et al. Tight Analyses for Non-Smooth Stochastic Gradient Descent, 2018, COLT.

[21] Yingyu Liang, et al. Generalization and Equilibrium in Generative Adversarial Nets (GANs), 2017, ICML.

[22] Michael I. Jordan, et al. On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems, 2019, ICML.

[23] David Pfau, et al. Unrolled Generative Adversarial Networks, 2016, ICLR.

[24] Fei Xia, et al. Understanding GANs: the LQG Setting, 2017, ArXiv.

[25] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.

[26] Soumith Chintala, et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015, ICLR.

[27] J. Danskin. The Theory of Max-Min and its Application to Weapons Allocation Problems, 1967.

[28] Jacob D. Abernethy, et al. How to Train Your DRAGAN, 2017, ArXiv.

[29] Alexandros G. Dimakis, et al. SGD Learns One-Layer Networks in WGANs, 2019, ICML.