Kappa-Join: Efficient Execution of Existential Quantification in XML Query Languages

XML query languages feature powerful primitives for formulating queries that involve existentially quantified comparison expressions. When such comparisons span several scopes, they are correlated and therefore difficult to evaluate efficiently. In this paper, we develop a new ternary operator, called Kappa-Join, for efficiently evaluating queries with existential quantification. In XML queries, a correlation predicate can occur conjunctively or disjunctively. Our decorrelation approach not only improves performance in the conjunctive case but also enables decorrelation in the disjunctive case, which is not possible with any known technique. In an experimental evaluation, we compare the query execution times of the Kappa-Join with those of existing XPath evaluation techniques to demonstrate the effectiveness of our new operator.
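
To make the notion of a correlated, existentially quantified comparison concrete, the following is a minimal Python sketch. It is not the Kappa-Join operator; it only contrasts a nested-loop evaluation with a generic semi-join style decorrelation for a single conjunctive predicate. The data (`orders`, `suppliers`) and the function names are hypothetical and chosen purely for illustration.

```python
# Hypothetical data: each order lists item codes; each supplier lists the codes it stocks.
orders = {"o1": ["a", "b"], "o2": ["c"], "o3": []}
suppliers = {"s1": ["b", "x"], "s2": ["y"]}


def correlated(orders, suppliers):
    """Nested-loop evaluation: the inner sequence is rescanned for every outer item."""
    result = []
    for oid, codes in orders.items():
        # Existential comparison: true if SOME pair (c, s) compares equal.
        if any(c == s for c in codes for stock in suppliers.values() for s in stock):
            result.append(oid)
    return result


def decorrelated(orders, suppliers):
    """Semi-join style evaluation: collect the inner values once, then probe the set."""
    stocked = {s for stock in suppliers.values() for s in stock}
    return [oid for oid, codes in orders.items() if any(c in stocked for c in codes)]


print(correlated(orders, suppliers))
print(decorrelated(orders, suppliers))
```

Both functions select the same orders; the second scans the inner input once instead of once per outer item. This simple rewrite covers only an equality predicate in a conjunctive position and does not address the disjunctive case discussed in the abstract.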
