Improving GAN Training with Probability Ratio Clipping and Sample Reweighting

Despite success on a wide range of vision problems, generative adversarial networks (GANs) can suffer from inferior performance due to unstable training, especially for text generation. We propose a new variational GAN training framework which enjoys superior training stability. Our approach is inspired by a connection between GANs and reinforcement learning under a variational perspective. The connection leads to (1) probability ratio clipping that regularizes generator training to prevent excessively large updates, and (2) a sample reweighting mechanism that stabilizes discriminator training by downplaying bad-quality fake samples. We provide theoretical analysis on the convergence of our approach. By plugging the training approach into diverse state-of-the-art GAN architectures, we obtain significantly improved performance over a range of tasks, including text generation, text style transfer, and image generation.
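The two mechanisms can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the PPO-style clipped surrogate and the softmax reweighting by discriminator score below are illustrative stand-ins for the abstract's "probability ratio clipping" and "sample reweighting"; all function names and the `eps` hyperparameter are assumptions.

```python
import numpy as np

def clipped_generator_loss(log_p_new, log_p_old, advantages, eps=0.2):
    """PPO-style surrogate loss (illustrative): clip the probability
    ratio p_new / p_old to [1 - eps, 1 + eps] so a single generator
    update cannot move the policy/generator too far."""
    ratio = np.exp(log_p_new - log_p_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # Pessimistic bound: take the element-wise minimum of the two
    # objectives, then negate to obtain a loss to minimize.
    return -np.mean(np.minimum(unclipped, clipped))

def reweighted_discriminator_loss(d_real, d_fake):
    """Discriminator loss with fake samples reweighted (illustrative):
    softmax weights over discriminator scores down-weight fakes the
    discriminator already rates as low quality."""
    w = np.exp(d_fake - d_fake.max())  # subtract max for stability
    w = w / w.sum()                    # weights sum to 1 over the batch
    # -log sigmoid(d_real): push real-sample scores up.
    real_term = -np.mean(np.log(1.0 / (1.0 + np.exp(-d_real))))
    # -sum_i w_i * log(1 - sigmoid(d_fake_i)): weighted fake-sample term.
    fake_term = -np.sum(w * np.log(1.0 / (1.0 + np.exp(d_fake))))
    return real_term + fake_term
```

With `eps=0.2`, a sample whose ratio has drifted to 10 contributes at most `1.2 * advantage` to the surrogate, which is the sense in which clipping prevents excessively large generator updates.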
