Variance Reduction in Black-box Variational Inference by Adaptive Importance Sampling

Overdispersed black-box variational inference employs importance sampling to reduce the variance of the Monte Carlo gradient estimator in black-box variational inference, but it relies on a simple, fixed overdispersed proposal distribution. This paper investigates how to adaptively obtain a better proposal distribution for lower variance. To this end, we directly approximate the theoretically optimal proposal using a Monte Carlo moment-matching step at each variational iteration. We call this adaptive proposal the moment-matching proposal (MMP). Experimental results on two Bayesian models show that the MMP effectively reduces variance in black-box learning and outperforms baseline inference algorithms.
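The idea above can be sketched in code. The following is a minimal toy illustration, not the paper's implementation: it assumes a one-dimensional Gaussian variational family, a hypothetical standard-normal toy model `log_joint`, and a self-normalized moment-matching rule that fits the proposal to weights proportional to the magnitude of the score-function integrand (the variance-optimal proposal shape). All names and the specific weighting are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_q(z, mu, sigma):
    """Log-density of the Gaussian variational distribution q(z; mu, sigma)."""
    return -0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

def log_joint(z):
    """Hypothetical toy model: standard-normal log-joint (no data term)."""
    return -0.5 * z ** 2 - 0.5 * np.log(2 * np.pi)

mu, sigma = 0.5, 1.0            # variational parameters of q
prop_mu, prop_sigma = 0.5, 2.0  # overdispersed proposal r, wider than q

S = 5000
z = rng.normal(prop_mu, prop_sigma, S)  # sample from the proposal r

# Score function of q with respect to mu, and importance weights q/r.
score = (z - mu) / sigma ** 2
w = np.exp(log_q(z, mu, sigma) - log_q(z, prop_mu, prop_sigma))

# Importance-sampled score-function (REINFORCE-style) gradient estimate.
integrand = score * (log_joint(z) - log_q(z, mu, sigma))
grad = np.mean(w * integrand)

# Moment-matching step: the variance-optimal proposal is proportional to
# q(z) |integrand(z)|, so approximate its first two moments with
# self-normalized importance weights and use them as the next proposal.
mm_w = w * np.abs(integrand)
mm_w /= mm_w.sum()
new_prop_mu = np.sum(mm_w * z)
new_prop_sigma = np.sqrt(np.sum(mm_w * (z - new_prop_mu) ** 2))
```

At each variational iteration the same samples used for the gradient estimate are reused to update `(new_prop_mu, new_prop_sigma)`, so the adaptation adds little overhead beyond the weighted moment computations.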
