Index vs. Navigation in XPath Evaluation

A well-known rule of thumb holds that scanning is preferable to using an index when more than 10% of the data is accessed. This rule was formulated for relational databases, but is it still valid for XML queries? In this paper, we develop similar rules of thumb for XML queries by experimentally comparing different execution strategies, such as navigation and index-based access. These rules can be applied immediately for heuristic optimization of XML queries and, in the long run, may serve as a foundation for cost-based query optimization in XQuery.
