The relativistic discriminator: a key element missing from standard GAN

In the standard generative adversarial network (SGAN), the discriminator estimates the probability that the input data is real. The generator is trained to increase the probability that fake data is real. We argue that it should also simultaneously decrease the probability that real data is real because 1) this would account for the a priori knowledge that half of the data in the mini-batch is fake, 2) this would be observed with divergence minimization, and 3) in optimal settings, SGAN would be equivalent to integral probability metric (IPM) GANs. We show that this property can be induced by using a relativistic discriminator, which estimates the probability that the given real data is more realistic than randomly sampled fake data. We also present a variant in which the discriminator estimates the probability that the given real data is more realistic than fake data, on average. We generalize both approaches to non-standard GAN loss functions and refer to them respectively as Relativistic GANs (RGANs) and Relativistic average GANs (RaGANs). We show that IPM-based GANs are a subset of RGANs that use the identity function. Empirically, we observe that 1) RGANs and RaGANs are significantly more stable and generate higher-quality data samples than their non-relativistic counterparts, 2) standard RaGAN with gradient penalty generates data of better quality than WGAN-GP while requiring only a single discriminator update per generator update (reducing the time taken to reach the state-of-the-art by 400%), and 3) RaGANs are able to generate plausible high-resolution images (256x256) from a very small sample (N=2011), while GAN and LSGAN cannot; these images are of significantly better quality than those generated by WGAN-GP and SGAN with spectral normalization.
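To make the relativistic average variant concrete, below is a minimal sketch of the RaGAN losses in their standard (binary cross-entropy) form, written in PyTorch. The function names ragan_d_loss and ragan_g_loss, and the assumption that C_real and C_fake hold the discriminator's raw (pre-sigmoid) outputs on a mini-batch of real and fake samples, are illustrative choices rather than the paper's reference implementation.

    import torch
    import torch.nn.functional as F

    def ragan_d_loss(C_real: torch.Tensor, C_fake: torch.Tensor) -> torch.Tensor:
        # The discriminator is trained so that real samples are judged more realistic
        # than fake samples on average, and fake samples less realistic than real ones.
        real_vs_fake = C_real - C_fake.mean()
        fake_vs_real = C_fake - C_real.mean()
        return (F.binary_cross_entropy_with_logits(real_vs_fake, torch.ones_like(C_real)) +
                F.binary_cross_entropy_with_logits(fake_vs_real, torch.zeros_like(C_fake)))

    def ragan_g_loss(C_real: torch.Tensor, C_fake: torch.Tensor) -> torch.Tensor:
        # The generator loss reverses the roles: it pushes fake samples to look more
        # realistic than real ones on average, and, symmetrically, decreases the
        # estimated realism of real data relative to fake data.
        real_vs_fake = C_real - C_fake.mean()
        fake_vs_real = C_fake - C_real.mean()
        return (F.binary_cross_entropy_with_logits(fake_vs_real, torch.ones_like(C_fake)) +
                F.binary_cross_entropy_with_logits(real_vs_fake, torch.zeros_like(C_real)))

In this sketch, replacing the averaged term (e.g. C_fake.mean()) with the critic output on a single paired fake sample would give the non-averaged RGAN formulation described above.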
