Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad Samples

We introduce a simple (one line of code) modification to the Generative Adversarial Network (GAN) training algorithm that materially improves results with no increase in computational cost: When updating the generator parameters, we simply zero out the gradient contributions from the elements of the batch that the critic scores as "least realistic". Through experiments on many different GAN variants, we show that this "top-k update" procedure is a generally applicable improvement. In order to understand the nature of the improvement, we conduct extensive analysis on a simple mixture-of-Gaussians dataset and discover several interesting phenomena. Among these is that, when gradient updates are computed using the worst-scoring batch elements, samples can actually be pushed further away from their nearest mode. We also apply our method to recent GAN variants and improve state-of-the-art FID for conditional generation from 9.21 to 8.57 on CIFAR-10.
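To make the one-line change concrete, here is a minimal PyTorch sketch of the top-k generator update described above. The helper name `topk_generator_loss` and the choice of the non-saturating generator loss are illustrative assumptions; the core of the method is simply that the generator loss (whatever its form) is computed only on the k batch elements the critic scores highest, so the lowest-scoring samples contribute no gradient.

```python
import torch
import torch.nn.functional as F

def topk_generator_loss(critic_scores: torch.Tensor, k: int) -> torch.Tensor:
    """Non-saturating generator loss computed only on the k batch elements
    the critic scores as most realistic; gradient contributions from the
    remaining (lowest-scoring) samples are discarded."""
    scores = critic_scores.view(-1)            # critic logits D(G(z)), shape (B,)
    topk_scores, _ = torch.topk(scores, k)     # keep the k highest-scoring fakes
    return F.softplus(-topk_scores).mean()     # -log sigmoid(D(G(z))) on survivors

# Illustrative generator step (generator, critic, z_dim are assumed names):
# fake = generator(torch.randn(batch_size, z_dim))
# loss_g = topk_generator_loss(critic(fake), k=batch_size // 2)
# loss_g.backward()
```

In the paper's experiments, k is not fixed: it starts at the full batch size and is annealed toward a fraction of the batch size as training progresses, so early training uses every sample and only later updates discard the worst-scoring ones. The fixed `k=batch_size // 2` above is a placeholder for that schedule.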
