Containment of Conjunctive Queries on Annotated Relations

We study containment and equivalence of (unions of) conjunctive queries on relations annotated with elements of a commutative semiring. Such relations, and the semantics of positive relational queries on them, were introduced in a recent paper as a generalization of set semantics, bag semantics, incomplete databases, and databases annotated with various kinds of provenance information. We obtain positive decidability results and complexity characterizations for databases with lineage, why-provenance, and provenance polynomial annotations, for both conjunctive queries and unions of conjunctive queries. At least one of these results is surprising, since provenance polynomial annotations seem “more expressive” than bag semantics, and under bag semantics containment of unions of conjunctive queries is known to be undecidable. The decision procedures rely on variations of the notion of containment mapping. We also show that for any positive semiring (a very large class) and conjunctive queries without self-joins, equivalence is the same as isomorphism.
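To make the setting concrete, the following is a minimal sketch of how a conjunctive query can be evaluated over semiring-annotated relations: the annotation of an output tuple is the semiring sum, over all valuations of the query variables, of the semiring product of the annotations of the body atoms. The names `Semiring`, `evaluate`, `BOOL`, and `NAT`, and the example relation `R`, are illustrative assumptions and do not come from the paper; the brute-force enumeration of valuations is chosen for clarity, not efficiency.

```python
from dataclasses import dataclass
from itertools import product
from typing import Any, Callable

@dataclass(frozen=True)
class Semiring:
    """A commutative semiring (hypothetical helper type for this sketch)."""
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]

# Set semantics corresponds to the Boolean semiring, bag semantics to the naturals.
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
NAT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)

def evaluate(head, body, db, K):
    """Evaluate a conjunctive query over annotated relations.

    head: tuple of variable names in the query head
    body: list of (relation_name, tuple_of_variable_names) atoms
    db:   relation_name -> {tuple: annotation}; absent tuples are annotated K.zero
    """
    variables = sorted({v for _, args in body for v in args})
    domain = sorted({c for rel in db.values() for t in rel for c in t})
    result = {}
    # Enumerate every valuation of the variables over the active domain.
    for values in product(domain, repeat=len(variables)):
        nu = dict(zip(variables, values))
        ann = K.one
        for rel, args in body:
            fact = tuple(nu[v] for v in args)
            ann = K.mul(ann, db[rel].get(fact, K.zero))
        if ann != K.zero:
            out = tuple(nu[v] for v in head)
            result[out] = K.add(result.get(out, K.zero), ann)
    return result

# Q(x) :- R(x, y), R(y, z), on a small example relation (assumed data).
head = ("x",)
body = [("R", ("x", "y")), ("R", ("y", "z"))]
bag_db = {"R": {("a", "b"): 2, ("b", "c"): 3, ("b", "b"): 1}}
set_db = {"R": {t: True for t in bag_db["R"]}}

bag_answer = evaluate(head, body, bag_db, NAT)   # multiplicities
set_answer = evaluate(head, body, set_db, BOOL)  # membership only
```

Under these assumptions, `bag_answer` maps `("a",)` to 8 and `("b",)` to 4, while `set_answer` only records that both tuples are present. Swapping in a different `Semiring` instance changes the kind of annotation computed without changing the evaluation code, which is the sense in which the annotated semantics generalizes set and bag semantics.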
