Oblivious bounds on the probability of Boolean functions

This article develops upper and lower bounds for the probability of Boolean functions by treating multiple occurrences of variables as independent and assigning them new individual probabilities. We call this approach dissociation and give an exact characterization of optimal oblivious bounds, that is, bounds for which the new probabilities are chosen independently of the probabilities of all other variables. Our motivation comes from the weighted model counting problem (or, equivalently, the problem of computing the probability of a Boolean function), which is #P-hard in general. By performing several dissociations, one can transform a Boolean formula whose probability is difficult to compute into one whose probability is easy to compute and which, given appropriately chosen probabilities for the dissociated variables, is guaranteed to provide an upper or lower bound on the probability of the original formula. Our new bounds shed light on the connection between previous relaxation-based and model-based approximations and unify them as concrete choices in a larger design space. We also show how our theory allows a standard relational database management system (DBMS) to compute both upper and lower bounds for hard probabilistic queries in guaranteed polynomial time.
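To make the idea of dissociation concrete, the following is a minimal sketch, not the article's algorithm. It takes the small monotone formula (x AND a) OR (x AND b), replaces the two occurrences of x with fresh independent variables x1 and x2, and compares exact probabilities computed by brute-force enumeration. The formula, the probability values, and the helper name `probability` are illustrative assumptions. The two probability choices for x1 and x2 (reusing p for an upper bound, and 1 - sqrt(1 - p) for a lower bound) are assumed here for this disjunctive case and are only checked numerically on this one instance.

```python
from itertools import product
from math import sqrt


def probability(formula, probs):
    """Exact probability of `formula` under independent variables.

    Enumerates all 2^n assignments, so it is only usable for tiny examples.
    `probs` maps each variable name to its probability of being True.
    """
    names = list(probs)
    total = 0.0
    for values in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, values))
        if formula(assignment):
            weight = 1.0
            for name, value in assignment.items():
                weight *= probs[name] if value else 1.0 - probs[name]
            total += weight
    return total


def original(v):
    # x occurs twice, so the two disjuncts are correlated.
    return (v["x"] and v["a"]) or (v["x"] and v["b"])


def dissociated(v):
    # Each occurrence of x becomes its own independent variable.
    return (v["x1"] and v["a"]) or (v["x2"] and v["b"])


p = 0.5                       # illustrative probability of x
base = {"a": 0.5, "b": 0.5}   # illustrative probabilities of a and b

exact = probability(original, {"x": p, **base})

# Assumed upper-bound choice: keep the original probability for both copies.
upper = probability(dissociated, {"x1": p, "x2": p, **base})

# Assumed lower-bound choice: (1 - p_low)^2 = 1 - p.
p_low = 1.0 - sqrt(1.0 - p)
lower = probability(dissociated, {"x1": p_low, "x2": p_low, **base})

print(lower, exact, upper)
```

On this instance the exact probability is 0.375 and the dissociated formula gives 0.4375 with the first choice, so the three values come out ordered as lower, exact, upper. In both cases the new probabilities depend only on p and not on the probabilities of a and b, which is what the abstract means by "oblivious".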
