Signature-based Filtering Techniques for Structural Joins of XML Data

Queries on XML documents typically combine selections on element contents with, via path expressions, the structural relationships between tagged elements. Efficient support for structural joins is thus the key to efficient implementation of XML queries. By using a stack to keep track of ancestor-descendant structural relationships, the stack-tree join algorithm improves the performance of structural joins by eliminating comparisons that can be deduced to be unnecessary. However, stack-tree join cannot prevent "unwanted" comparisons between elements that do not participate in the join. To solve this problem, we propose a signature filter, which takes advantage of the encoding schemes proposed for XML and occupies little space. We then present a pointer-based signature filter to skip the "unwanted" elements. To further improve the filtering efficiency, we finally propose an optimized pointer-based filter that uses the conjunction of two signatures. A performance study shows that our signature-based filters have excellent filtering performance and significantly improve the performance of structural joins.
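
To make the idea concrete, the following is a minimal sketch of a stack-based ancestor-descendant join combined with a bitmap filter. It is an illustration under stated assumptions, not the filters proposed in this paper: elements are assumed to carry a (start, end) region encoding with properly nested regions, both input lists are assumed sorted by start position, and the names `stack_tree_join`, `build_signature`, `filter_descendants`, and `bucket_size` are hypothetical. The signature here simply marks which fixed-size position buckets are covered by some ancestor region, so a descendant whose start falls in an unmarked bucket cannot join and is skipped before the join runs.

```python
def stack_tree_join(ancestors, descendants):
    """Join (start, end) regions; both lists must be sorted by start.

    Returns (ancestor, descendant) pairs where the ancestor region
    contains the descendant region. Assumes properly nested regions.
    """
    out, stack = [], []
    i = j = 0
    while j < len(descendants) and (i < len(ancestors) or stack):
        a = ancestors[i] if i < len(ancestors) else None
        d = descendants[j]
        take_ancestor = a is not None and a[0] < d[0]
        next_start = a[0] if take_ancestor else d[0]
        # Drop stacked ancestors that end before the next element starts.
        while stack and stack[-1][1] < next_start:
            stack.pop()
        if take_ancestor:
            stack.append(a)
            i += 1
        else:
            # Every ancestor still on the stack contains d.
            out.extend((s, d) for s in stack)
            j += 1
    return out


def build_signature(regions, bucket_size):
    """Bitmap with one bit per position bucket covered by any region.

    Illustrative assumption only: the paper's signature design may differ.
    """
    sig = 0
    for start, end in regions:
        for b in range(start // bucket_size, end // bucket_size + 1):
            sig |= 1 << b
    return sig


def filter_descendants(descendants, sig, bucket_size):
    """Keep only descendants whose start bucket is marked in the signature."""
    return [d for d in descendants if (sig >> (d[0] // bucket_size)) & 1]


ancestors = [(1, 10), (2, 5), (40, 45)]
descendants = [(3, 4), (6, 7), (20, 21), (30, 31), (41, 42)]

sig = build_signature(ancestors, bucket_size=8)
candidates = filter_descendants(descendants, sig, bucket_size=8)
pairs = stack_tree_join(ancestors, candidates)
```

In this toy input the descendants at positions 20 and 30 lie in buckets that no ancestor region covers, so the filter removes them and the join result is unchanged. The filter can admit false positives (an element in a marked bucket that still has no ancestor) but never discards a true match, which is the property any such prefilter needs.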
