Iterative Refinement of Approximate Posterior for Training Directed Belief Networks

Variational inference methods that use an inference or recognition network to train and evaluate deep directed graphical models have advanced well beyond traditional variational inference and Markov chain Monte Carlo methods. These techniques offer greater flexibility with simpler and faster inference, yet training and evaluation remain a challenge. We propose a method for improving the per-example approximate posterior by iterative refinement, which can provide notable gains in maximizing the variational lower bound of the log-likelihood and works with both continuous and discrete latent variables. We evaluate our approach as a method of both training and evaluating directed graphical models. We show that, when used for training, iterative refinement improves the variational lower bound and can also improve the log-likelihood over related methods. We also show that iterative refinement can be used to obtain a better estimate of the log-likelihood in any directed model trained with mean-field inference.
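To make the idea of per-example refinement concrete, the following is a minimal sketch on a toy problem, not the procedure used in this work. It assumes a one-dimensional linear-Gaussian model, z ~ N(0, 1) and x | z ~ N(w z, sigma^2), with a Gaussian approximate posterior q(z) = N(mu, s^2), so that the variational lower bound has a closed form. The refinement here is plain gradient ascent on that bound, and the names `elbo`, `refine`, and `log_marginal`, as well as the values of `w`, `sigma`, the step size, and the starting point, are all hypothetical choices made for illustration. The starting point stands in for the output of a recognition network.

```python
import math


def elbo(x, mu, log_s, w=2.0, sigma=1.0):
    """Closed-form lower bound for the toy model (assumed, not from the paper)."""
    s2 = math.exp(2.0 * log_s)
    # E_q[log p(x | z)] for x | z ~ N(w z, sigma^2)
    expected_loglik = (-0.5 * math.log(2 * math.pi * sigma**2)
                       - ((x - w * mu) ** 2 + w**2 * s2) / (2 * sigma**2))
    # E_q[log p(z)] for z ~ N(0, 1)
    expected_logprior = -0.5 * math.log(2 * math.pi) - 0.5 * (mu**2 + s2)
    # Entropy of q(z) = N(mu, s^2)
    entropy = 0.5 * math.log(2 * math.pi * math.e) + log_s
    return expected_loglik + expected_logprior + entropy


def refine(x, mu, log_s, steps=100, lr=0.1, w=2.0, sigma=1.0):
    """Refine (mu, log_s) for one example by gradient ascent on the bound."""
    for _ in range(steps):
        s2 = math.exp(2.0 * log_s)
        grad_mu = w * (x - w * mu) / sigma**2 - mu
        grad_log_s = 1.0 - s2 * (w**2 / sigma**2 + 1.0)
        mu += lr * grad_mu
        log_s += lr * grad_log_s
    return mu, log_s


def log_marginal(x, w=2.0, sigma=1.0):
    """Exact log p(x) for the toy model, where x ~ N(0, w^2 + sigma^2)."""
    var = w**2 + sigma**2
    return -0.5 * math.log(2 * math.pi * var) - x**2 / (2 * var)


x = 1.5
mu0, log_s0 = 0.0, 0.0          # stand-in for a recognition network's initial guess
mu1, log_s1 = refine(x, mu0, log_s0)

print("initial bound:", elbo(x, mu0, log_s0))
print("refined bound:", elbo(x, mu1, log_s1))
print("exact log p(x):", log_marginal(x))
```

In this toy setting the Gaussian family contains the true posterior, so the refined bound approaches the exact log-likelihood. In a deep directed model the bound generally stays below it, and refinement only narrows the gap.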
