S3: Evaluation of Tree-Pattern Queries Supported by Structural Summaries

XML queries are frequently based on path expressions whose elements are connected to each other in a tree-pattern structure, called a query tree pattern (QTP). A key operation in XML query processing is therefore finding the elements that match a given QTP. In this paper, we propose a novel method, called S^3, which processes the document's nodes selectively. In S^3, unlike all previous methods, path expressions are not executed directly on the XML document; they are first evaluated against a guidance structure called QueryGuide. Enriched by information extracted from the QueryGuide, a query execution plan, called SMP, is generated to provide focused pattern matching and to avoid document access as far as possible. Our experimental results confirm that S^3 and its optimized version OS^3 substantially outperform previous QTP processing methods with respect to response time, I/O overhead, and memory consumption, all critical parameters in any real multi-user environment.
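
The idea of evaluating a path expression against a structural summary before touching the document can be illustrated with a minimal sketch. It is not the S^3 algorithm, the QueryGuide, or the SMP plan: it assumes a simple DataGuide-style summary that maps each distinct root-to-node label path to the nodes sharing it, and it handles only linear paths with `/` and `//` axes, not full twig patterns. The names `build_summary`, `path_to_regex`, and `evaluate` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_summary(root):
    """Map each distinct root-to-node label path to the elements that have it."""
    summary = defaultdict(list)
    stack = [(root, (root.tag,))]
    while stack:
        node, path = stack.pop()
        summary[path].append(node)
        for child in node:
            stack.append((child, path + (child.tag,)))
    return summary


def path_to_regex(expr):
    """Translate a linear path using / and // into a regex over '/a/b/c' strings."""
    pattern = ""
    for axis, name in re.findall(r"(//|/)([^/]+)", expr):
        if axis == "//":
            # Descendant axis: allow any number of intermediate steps.
            pattern += r"(?:/[^/]+)*"
        pattern += "/" + re.escape(name)
    return re.compile(pattern + r"\Z")


def evaluate(summary, expr):
    """Match the query against the summary's distinct paths only (assumed
    far fewer than document nodes), then collect nodes of matching paths."""
    rx = path_to_regex(expr)
    hits = []
    for path, nodes in summary.items():
        if rx.match("/" + "/".join(path)):
            hits.extend(nodes)
    return hits


doc = ET.fromstring(
    "<lib><book><title>A</title><author>X</author></book>"
    "<journal><title>B</title></journal></lib>"
)
summary = build_summary(doc)
book_titles = evaluate(summary, "/lib/book/title")
all_titles = evaluate(summary, "//title")
```

In this sketch the query is compared only against the distinct label paths in the summary, so nodes whose paths cannot match are never inspected; a query such as `/lib/movie` is rejected without visiting any document node.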
