The chase revisited

We revisit the standard chase procedure, studying its properties and its applicability to classical database problems. We settle (in the negative) the open problem of whether termination of the standard chase is decidable, and we provide sufficient termination conditions that are strictly less conservative than the best previously known. We investigate the adequacy of the standard chase for checking query containment under constraints, checking constraint implication, and computing certain answers in data exchange, gaining a deeper understanding by separating the algorithm from its result. We identify the properties of the chase result that are essential to these applications, and we introduce the more general notion of an F-universal model set, which supports query and constraint languages that are closed under a class F of mappings. By choosing F appropriately, we extend prior results to existential first-order queries and ∀∃ first-order constraints. We show that the standard chase is incomplete for finding universal model sets, and we introduce the extended core chase, which is complete, i.e., it finds an F-universal model set whenever one exists. A key advantage of the new chase is that the same algorithm can be applied for all mapping classes F of interest, simply by modifying the set of constraints given as input. Even when restricted to the typical input in prior work, the new chase supports certain answer computation and containment/implication tests in strictly more cases than the incomplete standard chase.
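For readers unfamiliar with the procedure, the following is a minimal sketch of a standard chase restricted to tuple-generating dependencies, not the algorithm as formalized in the paper. The representation is an assumption made for illustration: facts are tuples such as `("Emp", "ann")`, variables are strings starting with `?`, and the names `matches`, `standard_chase`, and `max_steps` are hypothetical. Because termination is undecidable in general, the sketch stops after a fixed number of steps and reports whether a fixpoint was reached.

```python
from itertools import count


def is_var(term):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")


def matches(atoms, facts, h):
    """Yield every extension of the mapping h that sends all atoms into facts."""
    if not atoms:
        yield h
        return
    (rel, *args), rest = atoms[0], atoms[1:]
    for fact in facts:
        if fact[0] != rel or len(fact) != len(args) + 1:
            continue
        g = dict(h)
        ok = True
        for a, v in zip(args, fact[1:]):
            if is_var(a):
                if g.setdefault(a, v) != v:  # variable already bound elsewhere
                    ok = False
                    break
            elif a != v:  # constant mismatch
                ok = False
                break
        if ok:
            yield from matches(rest, facts, g)


def standard_chase(facts, tgds, max_steps=100):
    """Apply chase steps until no constraint is violated or max_steps is hit.

    Each tgd is a pair (body, head) of atom lists. Head variables that do not
    occur in the body are treated as existentially quantified.
    Returns (facts, terminated).
    """
    facts = set(facts)
    fresh = count()
    for _ in range(max_steps):
        for body, head in tgds:
            # A trigger is a body match that cannot be extended to the head.
            trigger = next(
                (h for h in matches(body, facts, {})
                 if next(matches(head, facts, h), None) is None),
                None,
            )
            if trigger is not None:
                break
        else:
            return facts, True  # every constraint is satisfied
        g = dict(trigger)
        for rel, *args in head:
            for a in args:
                if is_var(a) and a not in g:
                    g[a] = f"_N{next(fresh)}"  # fresh labeled null
            facts.add((rel, *(g[a] if is_var(a) else a for a in args)))
    return facts, False  # step budget exhausted; no fixpoint found
```

With the constraint "every employee has a manager", `standard_chase({("Emp", "ann")}, [([("Emp", "?x")], [("Mgr", "?x", "?y")])])` adds one `Mgr` fact with a labeled null and stops. With "every manager has a manager", `[([("Mgr", "?x", "?y")], [("Mgr", "?y", "?z")])]`, each step introduces a new null, so the sketch only stops because of the step budget.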
