Semi-Implicit Variational Inference

Semi-implicit variational inference (SIVI) expands the commonly used analytic variational distribution family by mixing the variational parameter with a flexible distribution. This mixing distribution can assume any density function, explicit or not, as long as independent random samples can be generated via reparameterization. SIVI not only expands the variational family to incorporate highly flexible variational distributions, including implicit ones that have no analytic density functions, but also sandwiches the evidence lower bound (ELBO) between a lower bound and an upper bound, and it admits an asymptotically exact surrogate ELBO that is amenable to optimization via stochastic gradient ascent. With a substantially expanded variational family and a novel optimization algorithm, SIVI is shown to closely match the accuracy of MCMC in inferring the posterior in a variety of Bayesian inference tasks.
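
The semi-implicit construction described above can be illustrated with a small NumPy sketch. It is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: the noise transformation in `sample_mixing`, the Gaussian conditional with fixed `sigma`, the toy unnormalized target in `log_target`, and all function names. The surrogate shown approximates the intractable mixture density by averaging the explicit conditional over the mixing draw that generated z plus K fresh mixing draws. With K = 0 it reduces to a lower bound on the ELBO, and it is expected to tighten as K grows.

```python
import numpy as np

def sample_mixing(rng, n):
    """Implicit mixing distribution (hypothetical): transform Gaussian noise.
    Only samples are used; no density of psi is ever evaluated."""
    eps = rng.standard_normal((n, 2))
    return 1.5 * np.tanh(2.0 * eps[:, 0]) + 0.3 * eps[:, 1] ** 2

def sample_semi_implicit(rng, n, sigma):
    """Draw psi from the mixing distribution, then z | psi ~ N(psi, sigma^2)
    via the reparameterization z = psi + sigma * noise."""
    psi = sample_mixing(rng, n)
    z = psi + sigma * rng.standard_normal(n)
    return z, psi

def log_normal_pdf(z, mean, sigma):
    """Log density of the explicit conditional q(z | psi)."""
    return -0.5 * np.log(2.0 * np.pi * sigma ** 2) - 0.5 * ((z - mean) / sigma) ** 2

def log_mixture_estimate(z, psi_own, psi_extra, sigma):
    """log of (1/(K+1)) * [q(z|psi_own) + sum_k q(z|psi_k)], computed stably."""
    means = np.concatenate([psi_own[:, None], psi_extra], axis=1)  # shape (n, K+1)
    logq = log_normal_pdf(z[:, None], means, sigma)
    m = logq.max(axis=1, keepdims=True)
    return m[:, 0] + np.log(np.exp(logq - m).mean(axis=1))

def log_target(z):
    """Toy unnormalized log joint log p(x, z); an assumption for this sketch."""
    return -0.5 * z ** 2

def surrogate_elbo(rng, n, K, sigma=0.5):
    """Monte Carlo estimate of the K-sample surrogate objective."""
    z, psi = sample_semi_implicit(rng, n, sigma)
    psi_extra = sample_mixing(rng, n * K).reshape(n, K)
    log_h = log_mixture_estimate(z, psi, psi_extra, sigma)
    return float(np.mean(log_target(z) - log_h))

if __name__ == "__main__":
    for K in (0, 5, 50):
        rng = np.random.default_rng(0)  # same z and psi draws for every K
        print(K, surrogate_elbo(rng, n=4000, K=K))
```

In a real application the parameters of the noise transformation would be trained by stochastic gradient ascent on this surrogate, typically with an automatic differentiation library; the sketch only evaluates the objective for a fixed transformation.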
