Continuous Relaxations for Discrete Hamiltonian Monte Carlo

Continuous relaxations play an important role in discrete optimization, but have not seen much use in approximate probabilistic inference. Here we show that a general form of the Gaussian Integral Trick makes it possible to transform a wide class of discrete variable undirected models into fully continuous systems. The continuous representation allows the use of gradient-based Hamiltonian Monte Carlo for inference, results in new ways of estimating normalization constants (partition functions), and in general opens up a number of new avenues for inference in difficult discrete systems. We demonstrate some of these continuous relaxation inference algorithms on a number of illustrative problems.
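
As a sketch of the underlying identity (notation ours, as an illustration rather than the paper's exact construction): take a binary pairwise model

\[
p(s) \propto \exp\!\big(a^\top s + \tfrac{1}{2}\, s^\top W s\big), \qquad s \in \{0,1\}^d,
\]

and choose a diagonal matrix \(D = \mathrm{diag}(d)\) so that \(W + D\) is positive definite; since \(s_i^2 = s_i\), adding \(\tfrac{1}{2} s^\top D s\) only shifts the linear term to \(a - \tfrac{1}{2} d\). The Gaussian (Hubbard-Stratonovich) identity

\[
\exp\!\big(\tfrac{1}{2}\, s^\top (W+D)\, s\big) \;\propto\; \int \exp\!\big(-\tfrac{1}{2}\, x^\top (W+D)^{-1} x + s^\top x\big)\, dx
\]

yields a joint \(p(s, x)\) that factorizes over the \(s_i\) given \(x\), so the discrete variables can be summed out in closed form, leaving the smooth marginal

\[
p(x) \;\propto\; \exp\!\big(-\tfrac{1}{2}\, x^\top (W+D)^{-1} x\big) \prod_{i=1}^{d} \big(1 + e^{\,a_i - d_i/2 + x_i}\big).
\]

This log-density is differentiable everywhere in \(x\), which is what makes gradient-based Hamiltonian Monte Carlo applicable to the originally discrete system.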
