Adaptive blocked Gibbs sampling for inference in probabilistic graphical models

Inference is a central problem in probabilistic graphical models and is often the main sub-step in probabilistic learning procedures. Accurate inference algorithms are therefore essential both to answer queries on a learned model and to learn a robust model. Gibbs sampling is arguably one of the most popular approximate inference methods and has been applied in many domains, including natural language processing and computer vision. Here, we develop an approach that improves the performance of blocked Gibbs sampling, an advanced variant of the Gibbs sampling algorithm. Specifically, we exploit correlations among variables in the probabilistic graphical model to build an adaptive blocked Gibbs sampler that automatically tunes its proposal distribution using statistics derived from previous samples: the proposal is adapted so that blocks containing highly correlated variables are sampled more often than the others. Selecting these hard-to-sample variables more frequently in turn improves the probability estimates produced by the sampler. Further, since adaptation breaks the Markovian property of the sampler, we develop a method that guarantees convergence to the correct stationary distribution despite the sampler being non-Markovian, by diminishing the adaptation of the block-selection probabilities over time. We evaluate our method on several discrete probabilistic graphical models taken from UAI challenge problems spanning different domains, and show that it is more accurate than methods that ignore correlation information in the proposal distribution of the sampler.
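
To make the adaptation scheme concrete, the sketch below (in Python) shows one way a sampler of this kind could be organized. It is a minimal illustration under several assumptions not spelled out in the abstract: the blocks are supplied by the caller, the helper `sample_block_given_rest` (a hypothetical name) resamples a block of variables from their joint conditional given the rest of the state, block-selection weights are taken proportional to the mean absolute within-block correlation estimated from past samples, and the adaptation step size shrinks as 1/t so the selection probabilities settle down over time.

```python
import numpy as np


def adaptive_blocked_gibbs(init_state, blocks, sample_block_given_rest,
                           n_iters=10_000, adapt_every=100, eps=1e-3, rng=None):
    """Random-scan blocked Gibbs with adaptive block-selection probabilities.

    A sketch only: `blocks` is a list of index tuples over the variables, and
    `sample_block_given_rest(state, block)` is an assumed user-supplied helper
    that resamples the given block from its joint conditional.
    """
    rng = np.random.default_rng() if rng is None else rng
    state = np.array(init_state, dtype=float)
    weights = np.ones(len(blocks))        # start with uniform block selection
    samples = []

    for t in range(1, n_iters + 1):
        probs = weights / weights.sum()
        k = rng.choice(len(blocks), p=probs)               # pick a block
        state = sample_block_given_rest(state, blocks[k])  # resample it jointly
        samples.append(state.copy())

        # Periodically re-estimate within-block correlations from past samples
        # and move the selection weights toward them with a diminishing step,
        # so the adaptation vanishes as t grows.
        if t % adapt_every == 0:
            corr = np.corrcoef(np.asarray(samples), rowvar=False)
            corr = np.nan_to_num(corr)    # guard against constant variables
            target = np.array(
                [np.mean(np.abs(corr[np.ix_(b, b)])) + eps for b in blocks]
            )
            gamma = adapt_every / t       # diminishing adaptation: gamma -> 0
            weights = (1.0 - gamma) * weights + gamma * target

    return np.asarray(samples)
```

The diminishing step size is what keeps this non-Markovian chain well behaved: because the change in the selection probabilities vanishes over time, the sampler satisfies the usual diminishing-adaptation condition invoked to justify adaptive MCMC. The particular correlation statistic and update schedule shown here are illustrative choices, not necessarily those used in the paper.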
