Wasserstein Variational Inference

This paper introduces Wasserstein variational inference, a new form of approximate Bayesian inference based on optimal transport theory. Wasserstein variational inference uses a new family of divergences that includes both f-divergences and the Wasserstein distance as special cases. The gradients of the Wasserstein variational loss are obtained by backpropagating through the Sinkhorn iterations. This technique results in a very stable likelihood-free training method that can be used with implicit distributions and probabilistic programs. Using the Wasserstein variational inference framework, we introduce several new forms of autoencoders and test their robustness and performance against existing variational autoencoding techniques.
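The statement that gradients are obtained by backpropagating through the Sinkhorn iterations can be illustrated with a minimal JAX sketch. It is not the paper's implementation: the function name `sinkhorn_loss`, the squared Euclidean ground cost, the uniform sample weights, and the values of `epsilon` and `n_iters` are all assumptions chosen for illustration. The paper's divergence family uses its own cost functions.

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp


def sinkhorn_loss(x, y, epsilon=0.1, n_iters=50):
    """Entropy-regularized transport cost between two uniformly weighted samples.

    Hypothetical helper: squared Euclidean cost and uniform weights are
    illustrative assumptions, not the cost family used in the paper.
    """
    n, m = x.shape[0], y.shape[0]
    # Pairwise ground cost matrix C[i, j] = ||x_i - y_j||^2.
    cost = jnp.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    log_a = jnp.full((n,), -jnp.log(n))  # log of uniform weights on x
    log_b = jnp.full((m,), -jnp.log(m))  # log of uniform weights on y
    f = jnp.zeros(n)  # dual potential for x
    g = jnp.zeros(m)  # dual potential for y
    # Log-domain Sinkhorn updates, unrolled so autodiff can trace every step.
    for _ in range(n_iters):
        f = -epsilon * logsumexp((g[None, :] - cost) / epsilon + log_b[None, :], axis=1)
        g = -epsilon * logsumexp((f[:, None] - cost) / epsilon + log_a[:, None], axis=0)
    # Approximate transport plan implied by the final potentials.
    log_plan = (f[:, None] + g[None, :] - cost) / epsilon + log_a[:, None] + log_b[None, :]
    plan = jnp.exp(log_plan)
    return jnp.sum(plan * cost)


# jax.grad differentiates with respect to the first argument (x),
# backpropagating through all unrolled Sinkhorn iterations.
grad_fn = jax.grad(sinkhorn_loss)

x = jnp.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y = x + 2.0
loss = sinkhorn_loss(x, y)
grad = grad_fn(x, y)
```

In a variational setting, `x` would be samples produced by a differentiable sampler, so the same gradient would flow on to the sampler's parameters by the chain rule. No density evaluation is needed, which is what makes the approach likelihood-free.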
