Hamiltonian Annealed Importance Sampling for partition function estimation

We introduce an extension to annealed importance sampling that uses Hamiltonian dynamics to rapidly estimate normalization constants. We demonstrate this method by computing log likelihoods in directed and undirected probabilistic image models. We compare the performance of linear generative models with both Gaussian and Laplace priors, product of experts models with Laplace and Student's t experts, the mc-RBM, and a bilinear generative model. We provide code to compare additional models.
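
As a rough illustration of the general idea, the sketch below estimates a log normalization constant by annealed importance sampling with a Hamiltonian Monte Carlo transition at each temperature. It is a minimal sketch under stated assumptions, not the authors' algorithm or their released code: the abstract does not specify the transition operator, so a standard Metropolis-corrected HMC step with full momentum resampling is swapped in. The toy Gaussian target, the linear annealing schedule, and all names (`log_f`, `hmc_step`, `ais_log_z`, `eps`, `n_leapfrog`) are hypothetical choices made for this example.

```python
import numpy as np

MU, S = 1.0, 0.5  # toy target parameters (assumed for illustration)

def log_f(x):
    # Unnormalized toy target: exp(-||x - MU||^2 / (2 S^2))
    return -0.5 * np.sum((x - MU) ** 2, axis=1) / S**2

def grad_log_f(x):
    return -(x - MU) / S**2

def log_p0(x):
    # Normalized initial distribution: standard Gaussian
    d = x.shape[1]
    return -0.5 * np.sum(x**2, axis=1) - 0.5 * d * np.log(2 * np.pi)

def grad_log_p0(x):
    return -x

def hmc_step(x, beta, rng, eps=0.1, n_leapfrog=5):
    # One Metropolis-corrected HMC step that leaves the intermediate
    # distribution p0^(1 - beta) * f^beta invariant, for all chains at once.
    def logp(z):
        return (1 - beta) * log_p0(z) + beta * log_f(z)

    def grad(z):
        return (1 - beta) * grad_log_p0(z) + beta * grad_log_f(z)

    p = rng.standard_normal(x.shape)          # fresh momentum
    x_new = x.copy()
    p_new = p + 0.5 * eps * grad(x_new)       # initial half step
    for i in range(n_leapfrog):
        x_new = x_new + eps * p_new
        if i < n_leapfrog - 1:
            p_new = p_new + eps * grad(x_new)
    p_new = p_new + 0.5 * eps * grad(x_new)   # final half step

    h_old = -logp(x) + 0.5 * np.sum(p**2, axis=1)
    h_new = -logp(x_new) + 0.5 * np.sum(p_new**2, axis=1)
    accept = np.log(rng.random(x.shape[0])) < h_old - h_new
    return np.where(accept[:, None], x_new, x)

def ais_log_z(n_chains=1000, n_temps=200, dim=2, seed=0):
    rng = np.random.default_rng(seed)
    betas = np.linspace(0.0, 1.0, n_temps + 1)  # linear schedule (assumed)
    x = rng.standard_normal((n_chains, dim))    # exact samples from p0
    log_w = np.zeros(n_chains)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Importance weight update, then move the samples toward p_b
        log_w += (b - b_prev) * (log_f(x) - log_p0(x))
        x = hmc_step(x, b, rng)
    m = log_w.max()                             # stable log-mean-exp
    return m + np.log(np.mean(np.exp(log_w - m)))

if __name__ == "__main__":
    true_log_z = 0.5 * 2 * np.log(2 * np.pi * S**2)
    print("AIS estimate:", ais_log_z(), "analytic:", true_log_z)
```

For this toy target the normalization constant is known in closed form, so the estimate can be checked against the analytic value; for the image models named above no such check is available, which is what motivates a sampling-based estimator.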
