Sliced Wasserstein Generative Models

In generative modeling, the Wasserstein distance (WD) has emerged as a useful metric to measure the discrepancy between generated and real data distributions. Unfortunately, the WD is challenging to approximate for high-dimensional distributions. In contrast, the sliced Wasserstein distance (SWD) factorizes high-dimensional distributions into their one-dimensional marginals and is thus easier to approximate. In this paper, we introduce novel approximations of the primal and dual SWD. Instead of using a large number of random projections, as conventional SWD approximation methods do, we propose to approximate SWDs with a small number of parameterized orthogonal projections in an end-to-end deep learning fashion. As concrete applications of our SWD approximations, we design two types of differentiable SWD blocks to equip modern generative frameworks: Auto-Encoders (AE) and Generative Adversarial Networks (GAN). In the experiments, we not only show the superiority of the proposed generative models on standard image-synthesis benchmarks, but also demonstrate state-of-the-art performance on challenging high-resolution image and video generation in an unsupervised manner.
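To make the contrast concrete, the following is a minimal sketch, not the authors' implementation: `swd_random` is the conventional Monte Carlo SWD estimator based on many random projections, and `OrthogonalSWD` is a hypothetical differentiable block with a small number of learned projections kept orthogonal via a QR re-parameterization. The names, the QR trick, and all hyperparameters are assumptions for illustration only.

```python
import torch

def swd_random(x, y, num_projections=512):
    """Conventional estimate of SWD_2^2 between equal-size empirical
    samples x, y of shape (n, d), using random projection directions.
    (Assumption: equal batch sizes, squared-Euclidean ground cost.)"""
    d = x.size(1)
    # Sample directions uniformly on the unit sphere in R^d.
    theta = torch.randn(d, num_projections, device=x.device)
    theta = theta / theta.norm(dim=0, keepdim=True)
    # Project onto each direction and sort: the 1-D Wasserstein distance
    # between empirical measures compares order statistics.
    x_proj, _ = torch.sort(x @ theta, dim=0)
    y_proj, _ = torch.sort(y @ theta, dim=0)
    return ((x_proj - y_proj) ** 2).mean()

class OrthogonalSWD(torch.nn.Module):
    """Hypothetical learnable SWD block in the spirit of the abstract:
    few projections, trained end-to-end, orthogonalized by QR."""
    def __init__(self, dim, num_projections=16):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(dim, num_projections))

    def forward(self, x, y):
        # Reduced QR gives orthonormal columns and is differentiable,
        # so the projections stay orthogonal while being optimized.
        q, _ = torch.linalg.qr(self.weight)
        x_proj, _ = torch.sort(x @ q, dim=0)
        y_proj, _ = torch.sort(y @ q, dim=0)
        return ((x_proj - y_proj) ** 2).mean()
```

In this sketch the key difference is where the projection directions come from: `swd_random` needs many directions because they are sampled blindly, whereas `OrthogonalSWD` can get by with far fewer because its directions are optimized jointly with the generative model.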
