Single-level Adversarial Data Synthesis based on Neural Tangent Kernels

Generative adversarial networks (GANs) have achieved impressive performance in data synthesis and have driven the development of many applications. However, GANs are known to be hard to train due to their bilevel objective, which leads to problems such as non-convergence, mode collapse, and vanishing gradients. In this paper, we propose a new generative model, the generative adversarial NTK (GA-NTK), that has a single-level objective. GA-NTK keeps the spirit of adversarial learning, which helps generate plausible data, while avoiding the training difficulties of GANs. This is achieved by modeling the discriminator as a Gaussian process with a neural tangent kernel (NTK-GP), whose training dynamics are fully described by a closed-form formula. We analyze the convergence behavior of GA-NTK trained by gradient descent and give sufficient conditions for convergence. We also conduct extensive experiments to study the advantages and limitations of GA-NTK and propose techniques that make it more practical.
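To make the idea concrete, below is a minimal sketch (not the paper's exact formulation) of how a single-level objective can be built from the closed-form NTK-GP training dynamics, using JAX and the neural-tangents library. It assumes the discriminator is the mean prediction f_t(X) = (I - e^{-eta K t}) y of an infinitely wide network trained by gradient flow for time t on real examples labeled +1 and synthesized examples labeled -1, where K is the NTK Gram matrix; the synthesized batch is then updated directly by gradient descent so that the discriminator scores it as real. All names and hyperparameters (ntk_gp_prediction, ga_ntk_loss, t, the step sizes) are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch of a single-level adversarial objective built on NTK-GP dynamics.
# Assumes JAX and the neural-tangents library; labels and hyperparameters are illustrative.
import jax
import jax.numpy as jnp
from neural_tangents import stax

# Infinitely wide fully connected discriminator; kernel_fn gives its NTK in closed form.
_, _, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

def ntk_gp_prediction(x_train, y_train, t, lr=1.0):
    """Closed-form mean output of the NTK-GP after gradient-flow time t on the
    squared loss, evaluated on its own training points (starting from f_0 = 0)."""
    K = kernel_fn(x_train, x_train, 'ntk')            # (n, n) NTK Gram matrix
    evals, evecs = jnp.linalg.eigh(K)                 # symmetric eigendecomposition
    decay = 1.0 - jnp.exp(-lr * evals * t)            # how much of each mode has been fit
    return evecs @ (decay[:, None] * (evecs.T @ y_train))   # (I - e^{-lr K t}) y

def ga_ntk_loss(x_fake, x_real, t=100.0):
    """Single-level objective: the NTK-GP discriminator is trained analytically to
    separate real (+1) from synthesized (-1); the synthesized batch is penalized
    for being recognized as fake."""
    x = jnp.concatenate([x_real, x_fake], axis=0)
    y = jnp.concatenate([jnp.ones((x_real.shape[0], 1)),
                         -jnp.ones((x_fake.shape[0], 1))], axis=0)
    preds = ntk_gp_prediction(x, y, t)
    fake_preds = preds[x_real.shape[0]:]
    return jnp.mean((fake_preds - 1.0) ** 2)          # push fake predictions toward +1

# Usage: update the synthesized batch directly by gradient descent on the single objective.
key_r, key_g = jax.random.split(jax.random.PRNGKey(0))
x_real = jax.random.normal(key_r, (64, 784))          # stand-in for a batch of real data
x_fake = jax.random.normal(key_g, (64, 784))          # synthesized batch, optimized directly
grad_fn = jax.jit(jax.grad(ga_ntk_loss))
for _ in range(200):
    x_fake = x_fake - 0.1 * grad_fn(x_fake, x_real)
```

Because the discriminator never has to be trained explicitly, the min-max game collapses into a single minimization over the synthesized data, which is the sense in which the objective is single-level.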
