Intracluster Moves for Constrained Discrete-Space MCMC

This paper addresses the problem of sampling from binary distributions with constraints. In particular, it proposes an MCMC method to draw samples from a distribution over the set of all states at a specified distance from some reference state. For example, when the reference state is the vector of zeros, the algorithm can draw samples from a binary distribution with a constraint on the number of active variables, that is, the number of 1's. We motivate the need for this algorithm with examples from statistical physics and probabilistic inference. Unlike previous algorithms proposed to sample from binary distributions with these constraints, the new algorithm allows for large moves in state space and tends to propose moves that are energetically favourable. The algorithm is demonstrated on three Boltzmann machines of varying difficulty: a ferromagnetic Ising model (with positive potentials), a restricted Boltzmann machine with learned Gabor-like filters as potentials, and a challenging three-dimensional spin glass (with positive and negative potentials).
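
To make the constraint concrete, the sketch below shows a simple baseline for the zero reference state: a Metropolis sampler whose only move swaps one active variable with one inactive variable, so the number of 1's never changes. This is a small-move, swap-style baseline of the kind the abstract contrasts against, not the algorithm proposed in the paper. The names `energy` and `swap_move_sampler`, the dictionary layout of `couplings`, and the inverse temperature `beta` are illustrative assumptions.

```python
import math
import random


def energy(state, couplings):
    """Pairwise energy E(s) = -sum_{(a,b)} J_ab * s_a * s_b with s in {0, 1}.

    `couplings` maps index pairs (a, b) to a coupling J_ab (assumed layout).
    """
    return -sum(j * state[a] * state[b] for (a, b), j in couplings.items())


def swap_move_sampler(state, couplings, n_steps, beta=1.0, seed=0):
    """Metropolis sampler restricted to states with a fixed number of 1's.

    Each proposal turns one active variable off and one inactive variable on.
    The counts of 1's and 0's are constant, so the proposal is symmetric and
    the plain Metropolis acceptance rule applies.
    """
    rng = random.Random(seed)
    state = list(state)
    current = energy(state, couplings)
    for _ in range(n_steps):
        ones = [i for i, s in enumerate(state) if s == 1]
        zeros = [i for i, s in enumerate(state) if s == 0]
        if not ones or not zeros:
            break  # all-0 or all-1 state: no swap is possible
        i, j = rng.choice(ones), rng.choice(zeros)
        proposal = list(state)
        proposal[i], proposal[j] = 0, 1
        proposed = energy(proposal, couplings)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if proposed <= current or rng.random() < math.exp(-beta * (proposed - current)):
            state, current = proposal, proposed
    return state


# Hypothetical usage: a 6-site ferromagnetic chain with exactly 3 active sites.
chain = {(i, i + 1): 1.0 for i in range(5)}
sample = swap_move_sampler([1, 0, 1, 0, 1, 0], chain, n_steps=200, beta=1.0, seed=1)
```

Each step changes only two variables, which illustrates why such local samplers can explore the constrained set slowly and why moves that change many variables at once are attractive.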
