On the tractability of query compilation and bounded treewidth

We consider the problem of computing the probability of a Boolean function, which generalizes the model counting problem. Given an OBDD for such a function, its probability can be computed in time linear in the size of the OBDD. In this paper we investigate the connection between treewidth and the size of the OBDD. Bounded treewidth has proven useful for many graph problems that are NP-hard in general but become tractable on graphs of bounded treewidth. However, it is less well understood how bounded treewidth can be used for the probability computation problem of a Boolean function. We introduce a new notion of treewidth of a Boolean function, called the expression treewidth, defined as the smallest treewidth of any DAG-expression representing the function. This notion includes some previously known tractable cases: all read-once Boolean functions, and all functions whose primal graph or incidence graph has bounded treewidth, also have bounded expression treewidth. We show that bounded expression treewidth implies the existence of a polynomial-size OBDD, and that bounded expression pathwidth implies the existence of a constant-width OBDD. We also show a converse of the latter result: constant-width OBDDs imply bounded expression pathwidth. We then study the implications of these results for query compilation, where the Boolean function is the lineage of a fixed query on varying input databases. We give a syntactic characterization of all UCQ≠ queries that admit a polynomial-size OBDD, showing that these are precisely the inversion-free queries with unrestricted use of ≠. It was previously known that inversion-free queries characterize precisely those UCQ queries that have a polynomial-size OBDD, and that these also have a constant-width OBDD; in contrast, inversion-free queries with ≠ have polynomial-width OBDDs, thus using the full power of OBDDs. Finally, we show that in the case of UCQ, the four classes studied in this paper collapse: bounded expression pathwidth, bounded expression treewidth, constant-width OBDD, and polynomial-size OBDD.
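
The linear-time probability computation on an OBDD mentioned above can be illustrated with a minimal Python sketch. It is not taken from the paper: the node encoding (a dictionary mapping a node name to a triple of variable, low child, high child), the function name `obdd_probability`, and the example function (x1 AND x2) OR x3 are assumptions chosen for illustration, and the variables are assumed to be independent. Each node is evaluated once by memoization, so the running time is linear in the number of nodes.

```python
from typing import Dict, Tuple, Union

# A child is either a terminal (True/False) or the name of an internal node.
Child = Union[bool, str]
# Hypothetical encoding: node name -> (variable, low child, high child).
Obdd = Dict[str, Tuple[str, Child, Child]]


def obdd_probability(root: Child, nodes: Obdd, prob: Dict[str, float]) -> float:
    """Probability that the function encoded by the OBDD is true,
    assuming independent variables with P(var = true) = prob[var]."""
    cache: Dict[str, float] = {}

    def visit(n: Child) -> float:
        if isinstance(n, bool):  # terminal node
            return 1.0 if n else 0.0
        if n not in cache:  # each internal node is computed only once
            var, low, high = nodes[n]
            p = prob[var]
            # Shannon expansion on the variable tested at this node.
            cache[n] = (1.0 - p) * visit(low) + p * visit(high)
        return cache[n]

    return visit(root)


# Example OBDD for (x1 AND x2) OR x3 under the order x1 < x2 < x3.
EXAMPLE: Obdd = {
    "n1": ("x1", "n3", "n2"),
    "n2": ("x2", "n3", True),
    "n3": ("x3", False, True),
}

if __name__ == "__main__":
    uniform = {"x1": 0.5, "x2": 0.5, "x3": 0.5}
    # With all probabilities 1/2, probability times 2^3 is the model count.
    print(obdd_probability("n1", EXAMPLE, uniform) * 8)
```

With every probability set to 1/2, multiplying the result by 2^n recovers the number of satisfying assignments, which is the sense in which probability computation generalizes model counting. The sketch uses recursion for brevity; a very deep OBDD would call for an iterative traversal in variable order.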
