Creating non-minimal triangulations for use in inference in mixed stochastic/deterministic graphical models

We demonstrate that certain large-clique graph triangulations can reduce computational requirements when making queries on mixed stochastic/deterministic graphical models. This runs counter to the conventional wisdom that triangulations that minimize clique size are always most desirable for computing queries on graphical models. Many of these large-clique triangulations are non-minimal and are thus unattainable via the popular elimination algorithm. We introduce ancestral pairs as the basis for novel triangulation heuristics and prove that, when searching for state-space-optimal triangulations in such graphs, no edges other than those between ancestral pairs need to be considered. Empirical results on random and real-world graphs are given. We also present an algorithm, with a correctness proof, for determining whether a triangulation can be obtained via elimination, and we show that the decision problem associated with finding state-space-optimal triangulations in this mixed setting is NP-complete.
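For readers unfamiliar with the elimination algorithm mentioned above, the following is a minimal sketch of triangulation by vertex elimination, together with a simple state space measure for the resulting cliques. It is an illustration under stated assumptions, not the paper's method: the function names `eliminate` and `state_space`, the example graph, and the cardinalities are all hypothetical, and the cost used here is the plain sum over maximal cliques of the product of variable cardinalities, which ignores any reduction that deterministic dependencies might provide.

```python
from itertools import combinations
from math import prod


def eliminate(adj, order):
    """Triangulate an undirected graph by eliminating vertices in `order`.

    Returns (fill_edges, maximal_cliques). Illustrative helper, not from the paper.
    """
    g = {v: set(ns) for v, ns in adj.items()}  # working copy of the graph
    fill, cliques = set(), []
    for v in order:
        nbrs = g[v]
        # Connect all not-yet-eliminated neighbours of v (these are fill-in edges).
        for a, b in combinations(sorted(nbrs), 2):
            if b not in g[a]:
                g[a].add(b)
                g[b].add(a)
                fill.add((a, b))
        cliques.append(frozenset(nbrs | {v}))
        # Remove v from the working graph.
        for u in nbrs:
            g[u].discard(v)
        del g[v]
    # Keep only cliques that are not strictly contained in another clique.
    maximal = [c for c in cliques if not any(c < d for d in cliques)]
    return fill, maximal


def state_space(cliques, card):
    """Sum over cliques of the product of variable cardinalities (assumed cost)."""
    return sum(prod(card[v] for v in c) for c in cliques)


# Hypothetical example: a 4-cycle a-b-c-d-a with assumed cardinalities.
adj = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
card = {"a": 2, "b": 3, "c": 2, "d": 3}

fill_1, cliques_1 = eliminate(adj, ["a", "b", "c", "d"])  # adds chord b-d
fill_2, cliques_2 = eliminate(adj, ["b", "a", "c", "d"])  # adds chord a-c
cost_1 = state_space(cliques_1, card)
cost_2 = state_space(cliques_2, card)
```

In this toy graph the two elimination orders produce different chords and different state space costs, which is the sense in which the choice of triangulation affects inference cost. Any triangulation produced this way is, by construction, obtainable via elimination; the non-minimal triangulations discussed in the abstract may not be.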
