Efficient Processing of XPath Queries Using Indexes

Several query languages have recently been proposed for XML and semistructured data, and all of them use regular path expressions to query XML data. A number of indexing schemes have also been proposed to optimize the processing of query paths. XPath provides the basis for processing queries on XML data in the form of regular path expressions. In this paper, we propose two algorithms, the Entry-point algorithm and the Rest-tree algorithm, that exploit different types of indexes, which we define, to process XPath queries efficiently. We also discuss and compare two variations in implementing these algorithms: Root-first and Bottom-first.
