Learning Model Reparametrizations: Implicit Variational Inference by Fitting MCMC distributions

We introduce a new algorithm for approximate inference that combines reparametrization, Markov chain Monte Carlo (MCMC) and variational methods. We construct a flexible implicit variational distribution by composing an arbitrary MCMC operation with a deterministic transformation, which can be optimized using the reparametrization trick. Unlike current methods for implicit variational inference, our method avoids computing log density ratios and is therefore easily applicable to arbitrary continuous and differentiable models. We demonstrate the proposed algorithm on fitting banana-shaped distributions and on training variational autoencoders.
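The combination described above can be illustrated with a minimal sketch; it is not the paper's implementation. Everything specific in it is an assumption made for illustration: the toy banana-shaped target, the diagonal affine map theta = mu + exp(log_s) * eps, the use of random-walk Metropolis as the MCMC operation, the choice to hold the MCMC samples of eps fixed when differentiating with respect to (mu, log_s), and the gradient clipping. The function names are hypothetical. Note that no log density ratio is evaluated anywhere: only the gradient of the log target and the log Jacobian of the transformation are needed.

```python
import numpy as np


def log_target(theta):
    """Unnormalized log density of a toy banana-shaped 2-D target (illustrative choice)."""
    t1, t2 = theta[:, 0], theta[:, 1]
    return -0.5 * (t1**2 / 4.0 + (t2 - 0.5 * t1**2) ** 2)


def grad_log_target(theta):
    """Gradient of log_target with respect to theta, one row per sample."""
    t1, t2 = theta[:, 0], theta[:, 1]
    r = t2 - 0.5 * t1**2
    return np.stack([-t1 / 4.0 + r * t1, -r], axis=1)


def mcmc_in_eps_space(mu, log_s, n_chains, n_steps, step, rng):
    """Random-walk Metropolis on eps, targeting p(g(eps)) * |det J|.

    The Jacobian term is constant in eps, so it cancels in the acceptance ratio.
    """
    s = np.exp(log_s)
    eps = rng.standard_normal((n_chains, 2))  # initial distribution N(0, I)
    logp = log_target(mu + s * eps)
    for _ in range(n_steps):
        prop = eps + step * rng.standard_normal(eps.shape)
        logp_prop = log_target(mu + s * prop)
        accept = np.log(rng.uniform(size=n_chains)) < logp_prop - logp
        eps = np.where(accept[:, None], prop, eps)
        logp = np.where(accept, logp_prop, logp)
    return eps


def fit_reparametrization(n_iters=200, n_chains=200, n_steps=5,
                          step=0.5, lr=0.01, seed=0):
    """Alternate short MCMC runs in eps-space with gradient steps on (mu, log_s)."""
    rng = np.random.default_rng(seed)
    mu, log_s = np.zeros(2), np.zeros(2)
    for _ in range(n_iters):
        eps = mcmc_in_eps_space(mu, log_s, n_chains, n_steps, step, rng)
        s = np.exp(log_s)
        g = grad_log_target(mu + s * eps)  # chain rule through theta = mu + s * eps
        grad_mu = g.mean(axis=0)
        # The +1 is the derivative of log|det J| = sum(log_s) w.r.t. each log_s.
        grad_log_s = (g * s * eps).mean(axis=0) + 1.0
        # Clipping is a safety guard added for this sketch only.
        mu = mu + lr * np.clip(grad_mu, -10.0, 10.0)
        log_s = log_s + lr * np.clip(grad_log_s, -10.0, 10.0)
    eps = mcmc_in_eps_space(mu, log_s, n_chains, n_steps, step, rng)
    theta = mu + np.exp(log_s) * eps  # samples from the implicit distribution
    return mu, log_s, theta
```

Calling `fit_reparametrization()` returns the learned transformation parameters and a batch of samples `theta` drawn by running the short chain in eps-space and mapping the result through the transformation. With an exact sampler the two gradient terms would average to zero for any (mu, log_s), so in this sketch the learning signal comes entirely from the mismatch left by running only a few MCMC steps.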
