AE-OT: A New Generative Model Based on Extended Semi-discrete Optimal Transport

Generative adversarial networks (GANs) have attracted great attention due to their capability to generate visually realistic images. However, most existing models suffer from mode collapse or mode mixture. In this work, we give a theoretical explanation of both problems via Figalli’s regularity theory of optimal transportation maps. In essence, the generator computes a transportation map between the white-noise distribution and the data distribution, and this map is in general discontinuous; DNNs, however, can only represent continuous maps. This intrinsic conflict induces mode collapse and mode mixture. To tackle both problems, we explicitly separate manifold embedding from optimal transportation: the first part is carried out by an autoencoder that maps the images into the latent space; the second part is accomplished by a GPU-based convex optimization that finds the discontinuous transportation map. Composing the extended OT map with the decoder, we can then generate new images from white noise. By avoiding the representation of discontinuous maps with DNNs, the AE-OT model effectively prevents mode collapse and mode mixture.
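The semi-discrete OT step described above can be sketched via the variational principle of Gu, Luo, Sun, and Yau [16]: transporting a continuous noise measure onto a discrete set of latent codes reduces to a convex optimization over one height variable per target point, whose gradient is the difference between the mass of each power cell and the prescribed target weight. The following is a minimal NumPy sketch under simplifying assumptions (uniform noise on the unit cube, Monte Carlo gradient estimates, plain gradient descent); the function names are illustrative and the actual AE-OT implementation is a GPU-based solver, not this code.

```python
import numpy as np

def semi_discrete_ot(targets, nu, n_samples=20000, lr=0.2, iters=500, seed=0):
    """Fit heights h for the semi-discrete OT map from the uniform measure mu
    on [0,1]^d to the discrete measure sum_i nu_i * delta_{targets_i}.

    Minimizes the convex dual energy
        E(h) = int max_i(<x, y_i> + h_i) dmu(x) - sum_i nu_i * h_i,
    whose gradient is dE/dh_i = mu(W_i(h)) - nu_i, where W_i(h) is the
    power cell of target i. Cell masses are estimated by Monte Carlo.
    """
    rng = np.random.default_rng(seed)
    n, d = targets.shape
    h = np.zeros(n)
    for _ in range(iters):
        x = rng.random((n_samples, d))               # samples from mu
        idx = np.argmax(x @ targets.T + h, axis=1)   # power-cell assignment
        cell_mass = np.bincount(idx, minlength=n) / n_samples
        h -= lr * (cell_mass - nu)                   # gradient step
        h -= h.mean()   # E is invariant under h + c; fix the constant
    return h

def transport(x, targets, h):
    """Piecewise-constant OT map: each noise sample goes to the target
    whose power cell contains it (hence the map is discontinuous)."""
    return targets[np.argmax(x @ targets.T + h, axis=1)]
```

In AE-OT this discontinuous map is then extended (to push samples toward the data manifold rather than exactly onto the training codes) and composed with the decoder; the sketch only illustrates why the OT step is a convex problem that needs no adversarial training.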

[1] Gabriel Peyré et al. Computational Optimal Transport, 2018, Found. Trends Mach. Learn.

[2] C. Villani. Optimal Transport: Old and New, 2008.

[3] Olivier Bachem et al. Assessing Generative Models via Precision and Recall, 2018, NeurIPS.

[4] Léon Bottou et al. Wasserstein Generative Adversarial Networks, 2017, ICML.

[5] Daan Wierstra et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.

[6] Shing-Tung Yau et al. A Geometric View of Optimal Transportation and Generative Model, 2017, Comput. Aided Geom. Des.

[7] Jimmy Ba et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[8] Aaron C. Courville et al. Improved Training of Wasserstein GANs, 2017, NIPS.

[9] Yoshua Bengio et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[10] Max Welling et al. Auto-Encoding Variational Bayes, 2013, ICLR.

[11] J. Zico Kolter et al. Gradient descent GAN optimization is locally stable, 2017, NIPS.

[12] Soumith Chintala et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015, ICLR.

[13] Mario Lucic et al. Are GANs Created Equal? A Large-Scale Study, 2017, NeurIPS.

[14] David Lopez-Paz et al. Optimizing the Latent Space of Generative Networks, 2017, ICML.

[15] Jitendra Malik et al. Non-Adversarial Image Synthesis With Generative Latent Nearest Neighbors, 2019, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16] S. Yau et al. Variational Principles for Minkowski Type Problems, Discrete Optimal Transport, and Discrete Monge-Ampere Equations, 2013, arXiv:1302.5472.

[17] Yoshua Bengio et al. Generative Adversarial Nets, 2014, NIPS.

[18] Xiaoou Tang et al. From Facial Expression Recognition to Interpersonal Relation Prediction, 2016, International Journal of Computer Vision.

[19] Sebastian Nowozin et al. Which Training Methods for GANs do actually Converge?, 2018, ICML.

[20] Koby Crammer et al. A theory of learning from different domains, 2010, Machine Learning.

[21] Jitendra Malik et al. Implicit Maximum Likelihood Estimation, 2018, arXiv.

[22] A. Figalli. Regularity Properties of Optimal Maps Between Nonconvex Domains in the Plane, 2010.

[23] F. Bach et al. Sharp asymptotic and finite-sample rates of convergence of empirical measures in Wasserstein distance, 2017, Bernoulli.

[24] Xiao Zhang et al. Normalized Diversification, 2019, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[25] Yuichi Yoshida et al. Spectral Normalization for Generative Adversarial Networks, 2018, ICLR.

[26] Changxi Zheng et al. BourGAN: Generative Networks with Metric Embeddings, 2018, NeurIPS.

[27] A. Figalli et al. Partial $W^{2,p}$ regularity for optimal transport maps, 2016, arXiv:1606.05173.

[28] Luis A. Caffarelli. A localization property of viscosity solutions to the Monge-Ampère equation and their strict convexity, 1990.

[29] Wojciech Zaremba et al. Improved Techniques for Training GANs, 2016, NIPS.

[30] Maneesh Kumar Singh et al. Disconnected Manifold Learning for Generative Adversarial Networks, 2018, NeurIPS.

[31] Bernhard Schölkopf et al. Wasserstein Auto-Encoders, 2017, ICLR.

[32] Ashish Khetan et al. PacGAN: The Power of Two Samples in Generative Adversarial Networks, 2017, IEEE Journal on Selected Areas in Information Theory.

[33] Y. Brenier. Polar Factorization and Monotone Rearrangement of Vector-Valued Functions, 1991.

[34] Roland Vollgraf et al. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms, 2017, arXiv.

[35] L. Caffarelli. Some regularity properties of solutions of Monge Ampère equation, 1991.

[36] J. Tenenbaum et al. A global geometric framework for nonlinear dimensionality reduction, 2000, Science.