Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width

Gibbs sampling on factor graphs is a widely used inference technique that often produces good empirical results. Theoretical guarantees for its performance are weak, however: even for tree-structured graphs, the mixing time of Gibbs sampling may be exponential in the number of variables. To help understand the behavior of Gibbs sampling, we introduce a new (hyper)graph property, called hierarchy width. We show that, under suitable conditions on the weights, bounded hierarchy width ensures polynomial mixing time. Our study of hierarchy width is motivated in part by a class of factor graph templates, hierarchical templates, which have bounded hierarchy width regardless of the data used to instantiate them. We demonstrate a rich application from natural language processing in which Gibbs sampling provably mixes rapidly and achieves accuracy that exceeds that of human volunteers.
