Sliced-Wasserstein Flows: Nonparametric Generative Modeling via Optimal Transport and Diffusions

Building on recent theory that established the connection between implicit generative modeling (IGM) and optimal transport, in this study we propose a novel parameter-free algorithm for learning the underlying distributions of complicated datasets and sampling from them. The proposed algorithm is based on a functional optimization problem that seeks a measure as close as possible to the data distribution while remaining expressive enough for generative modeling purposes. We formulate the problem as a gradient flow in the space of probability measures. The connection between gradient flows and stochastic differential equations lets us develop a computationally efficient algorithm for solving the optimization problem. We provide a formal theoretical analysis in which we prove finite-time error guarantees for the proposed algorithm. To the best of our knowledge, this is the first nonparametric IGM algorithm with explicit theoretical guarantees. Our experimental results support the theory and show that the algorithm successfully captures the structure of different types of data distributions.
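The two ingredients of the approach, a sliced-Wasserstein discrepancy between distributions and a particle discretization of the resulting gradient flow, can be illustrated with a small sketch. This is a toy illustration under stated assumptions, not the paper's exact algorithm: the names `sliced_wasserstein` and `swf_step`, the quantile-matching drift, and all parameter values are illustrative. It relies only on two standard facts: the sliced-Wasserstein distance averages 1D Wasserstein distances over random projection directions, and the 1D Wasserstein-p distance between two equal-size empirical measures reduces to comparing sorted samples.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=100, p=2, rng=None):
    """Monte Carlo estimate of the sliced Wasserstein-p distance between
    two equal-size empirical measures X, Y of shape (n, d)."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    # Random directions drawn uniformly on the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both samples onto each direction: shape (n, n_projections).
    Xp, Yp = X @ theta.T, Y @ theta.T
    # In 1D, optimal transport between equal-size empirical measures
    # matches sorted samples, so W_p reduces to sorted differences.
    Xs, Ys = np.sort(Xp, axis=0), np.sort(Yp, axis=0)
    return float(np.mean(np.abs(Xs - Ys) ** p) ** (1.0 / p))

def swf_step(particles, data, n_projections=50, step=0.5, noise=0.0, rng=None):
    """One toy particle-flow step: along each random direction, push the
    projected particles toward the corresponding quantiles of the projected
    data, then (optionally) add Langevin-style Gaussian noise."""
    rng = np.random.default_rng(rng)
    n, d = particles.shape
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    Pp, Dp = particles @ theta.T, data @ theta.T
    drift = np.zeros_like(particles)
    for l in range(n_projections):
        z = Pp[:, l]
        # Empirical CDF value of each particle among the particles.
        u = (z.argsort().argsort() + 0.5) / n
        # Target quantiles of the data along this direction.
        q = np.quantile(Dp[:, l], u)
        # Move each particle along theta toward its matched quantile.
        drift += np.outer(q - z, theta[l])
    new = particles + step * drift / n_projections
    if noise > 0.0:
        new = new + np.sqrt(2.0 * step * noise) * rng.normal(size=new.shape)
    return new
```

Iterating `swf_step` from an arbitrary initialization drives the particle cloud toward the data distribution; the `noise` term corresponds to the diffusion part of the underlying stochastic differential equation, with `noise=0.0` recovering a deterministic flow.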
