On Relativistic f-Divergences

This paper provides a more rigorous look at Relativistic Generative Adversarial Networks (RGANs). We prove that the discriminator's objective function is a statistical divergence for any concave function $f$ satisfying minimal properties ($f(0)=0$, $f'(0) \neq 0$, $\sup_x f(x)>0$). We also devise several variants of relativistic $f$-divergences. Wasserstein GAN (WGAN) was originally justified by the idea that the Wasserstein distance (WD) is most sensible because it is weak (i.e., it induces a weak topology). We show that the WD is weaker than $f$-divergences, which are in turn weaker than relativistic $f$-divergences. Given the good performance of RGANs, this suggests that WGAN performs well not primarily because of its weak metric, but rather because of regularization and its use of a relativistic discriminator. We also take a closer look at estimators of relativistic $f$-divergences. We introduce the minimum-variance unbiased estimator (MVUE) for Relativistic paired GANs (RpGANs; originally called RGANs, a name which could cause confusion) and show that it does not perform better. Furthermore, we show that the estimator of Relativistic average GANs (RaGANs) is only asymptotically unbiased, but that the finite-sample bias is small; removing this bias does not improve performance.
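
To make the estimator discussion concrete, here is a minimal PyTorch sketch (not the paper's reference code) of the three discriminator-loss estimators, assuming the standard choice $f(u) = \log \sigma(u)$; the tensor names `c_real` and `c_fake` (critic outputs $C(x)$ on $m$ real and $m$ fake samples) are hypothetical.

```python
import torch
import torch.nn.functional as F

def rpgan_loss_paired(c_real, c_fake):
    # Usual RpGAN estimator: pair the i-th real with the i-th fake sample
    # and average f(C(x_r) - C(x_f)) over the m one-to-one pairs.
    return -F.logsigmoid(c_real - c_fake).mean()

def rpgan_loss_mvue(c_real, c_fake):
    # MVUE: a two-sample U-statistic averaging f(C(x_r) - C(x_f)) over
    # all m * m real/fake pairs instead of only m of them.
    diffs = c_real.unsqueeze(1) - c_fake.unsqueeze(0)  # (m, m) pairwise differences
    return -F.logsigmoid(diffs).mean()

def ragan_loss(c_real, c_fake):
    # RaGAN estimator: plugs the *sample mean* of the opposing critic
    # scores into f; since f is nonlinear, this is biased for finite m,
    # but the bias vanishes as m grows (asymptotic unbiasedness).
    return -(F.logsigmoid(c_real - c_fake.mean()).mean()
             + F.logsigmoid(c_real.mean() - c_fake).mean())

# Stand-in critic scores on a batch of m = 64 real and fake samples.
c_real = torch.randn(64)
c_fake = torch.randn(64)
print(rpgan_loss_paired(c_real, c_fake).item(),
      rpgan_loss_mvue(c_real, c_fake).item(),
      ragan_loss(c_real, c_fake).item())
```

The paired estimator and the MVUE estimate the same relativistic $f$-divergence; the MVUE simply averages over every real/fake pair (a U-statistic in the sense of Hoeffding), which lowers variance at an $O(m^2)$ cost in the pairwise term.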
