Max-Sliced Wasserstein Distance and Its Use for GANs

Generative adversarial nets (GANs) and variational auto-encoders have significantly improved our distribution modeling capabilities, showing promise for dataset augmentation, image-to-image translation, and feature learning. However, to model high-dimensional distributions, sequential training and stacked architectures are common, increasing the number of tunable hyper-parameters as well as the training time. Beyond these architectural choices, the sample complexity of the distance measure remains one of the factors affecting GAN training. We first show that the recently proposed sliced Wasserstein distance has compelling sample complexity properties compared to the Wasserstein distance. To further improve the sliced Wasserstein distance, we then analyze its 'projection complexity' and develop the max-sliced Wasserstein distance, which enjoys compelling sample complexity while reducing projection complexity, albeit at the cost of a max estimation. We finally illustrate that the proposed distance readily trains GANs on high-dimensional images at resolutions up to 256x256.
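To make the two distances concrete, below is a minimal NumPy sketch (not taken from the paper; the function names and the random-search maximization are illustrative assumptions). It estimates the sliced Wasserstein-2 distance by averaging the closed-form 1D Wasserstein distance over random projection directions, and the max-sliced variant by keeping the single most discriminative direction. A practical implementation, as the abstract's "max estimation" suggests, would optimize the direction (e.g., by gradient ascent) rather than sample it at random.

```python
import numpy as np

def sliced_wasserstein(x, y, n_proj=100, seed=None):
    """Monte-Carlo estimate of the sliced Wasserstein-2 distance between two
    empirical distributions x, y of shape (n_samples, dim).
    Assumes equal sample sizes so the 1D distance has a sorting closed form."""
    assert x.shape == y.shape
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.standard_normal(x.shape[1])
        theta /= np.linalg.norm(theta)       # uniform direction on the unit sphere
        px, py = np.sort(x @ theta), np.sort(y @ theta)
        total += np.mean((px - py) ** 2)     # squared 1D W2 between sorted projections
    return np.sqrt(total / n_proj)

def max_sliced_wasserstein(x, y, n_proj=1000, seed=None):
    """Crude random-search estimate of the max-sliced Wasserstein-2 distance:
    keep the worst-case (most discriminative) projection instead of the average."""
    assert x.shape == y.shape
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_proj):
        theta = rng.standard_normal(x.shape[1])
        theta /= np.linalg.norm(theta)
        px, py = np.sort(x @ theta), np.sort(y @ theta)
        best = max(best, np.mean((px - py) ** 2))
    return np.sqrt(best)
```

The sketch relies on the fact that, for equal-size samples, the 1D Wasserstein-2 distance reduces to the mean squared difference of sorted projections; this closed form is what makes the sliced family cheap per projection, while the max-sliced variant trades many random projections for a single well-chosen one.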
