Uniform Reliability of Self-Join-Free Conjunctive Queries

The reliability of a Boolean Conjunctive Query (CQ) over a tuple-independent probabilistic database is the probability that the CQ is satisfied when each tuple of the database is sampled independently with its associated probability. For queries without self-joins (repeated relation symbols), the data complexity of this problem is fully characterized by a known dichotomy: reliability can be computed in polynomial time for hierarchical queries, and is #P-hard for non-hierarchical queries. Being hierarchical also characterizes tractability for other tasks: having read-once lineage formulas, supporting insertion/deletion updates to the database in constant time, and having a tractable computation of tuples' Shapley and Banzhaf values. In this work, we investigate a fundamental counting problem for CQs without self-joins: how many sets of facts from the input database satisfy the query? This is a simpler, uniform variant of the query reliability problem, where the probability of every tuple is required to be 1/2. For hierarchical queries, uniform reliability is in polynomial time, since it is a special case of the reliability problem. However, it has been an open question whether being hierarchical is necessary for the uniform reliability problem to be in polynomial time. In fact, the complexity of the problem has been unknown even for the simplest non-hierarchical CQs without self-joins. We resolve this open question by showing that uniform reliability is #P-complete for every non-hierarchical CQ without self-joins. Hence, we establish that being hierarchical also characterizes the tractability of unweighted counting of the satisfying tuple subsets. We also consider the generalization to query reliability where all tuples of the same relation have the same probability, and give preliminary results on the complexity of this problem.
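
To make the counting problem concrete, the following is a minimal illustrative sketch, not the construction used in this work. It assumes the standard definition of a hierarchical query (for any two variables, their sets of atoms are either disjoint or one contains the other) and uses the query R(x), S(x,y), T(y) as an assumed example of a simple non-hierarchical CQ without self-joins. The function names and the encoding of queries and facts are hypothetical, and the counter enumerates all subsets of facts, so it runs in exponential time and is meant only for tiny inputs.

```python
from itertools import product


def is_hierarchical(atoms):
    """Check the standard hierarchical condition on a self-join-free CQ.

    `atoms` is a list of tuples of variable names, one tuple per atom
    (hypothetical encoding chosen for this sketch).
    """
    variables = {v for atom in atoms for v in atom}
    # Map each variable to the set of atom indices in which it occurs.
    occurs = {v: {i for i, atom in enumerate(atoms) if v in atom}
              for v in variables}
    for x in variables:
        for y in variables:
            ax, ay = occurs[x], occurs[y]
            # Violation: the sets overlap but neither contains the other.
            if ax & ay and not (ax <= ay or ay <= ax):
                return False
    return True


def satisfies(r, s, t):
    """True iff some x, y have R(x), S(x, y), and T(y) all present."""
    return any(x in r and y in t for (x, y) in s)


def count_satisfying_subsets(R, S, T):
    """Brute-force count of the subsets of facts satisfying R(x), S(x,y), T(y)."""
    facts = ([("R", a) for a in R] + [("S", p) for p in S]
             + [("T", b) for b in T])
    count = 0
    for mask in product([False, True], repeat=len(facts)):
        kept = [f for f, keep in zip(facts, mask) if keep]
        r = {v for rel, v in kept if rel == "R"}
        s = {v for rel, v in kept if rel == "S"}
        t = {v for rel, v in kept if rel == "T"}
        if satisfies(r, s, t):
            count += 1
    return count


def uniform_reliability(R, S, T):
    """Probability of satisfaction when every fact is kept with probability 1/2."""
    n = len(R) + len(S) + len(T)
    return count_satisfying_subsets(R, S, T) / 2 ** n
```

For instance, `is_hierarchical([("x",), ("x", "y"), ("y",)])` returns `False` for the query above, while `is_hierarchical([("x",), ("x", "y")])` returns `True`. The uniform reliability is the count of satisfying subsets divided by 2^n, where n is the number of facts in the database.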
