Efficient Evaluation of XML Path Queries with Automata

Path queries are among the most frequently used components of XML query languages. Most proposed methods evaluate path queries in instance space, that is, directly against the XML instances, for example by XML tree traversal or containment joins. Automata match (AM), a query method based on automata techniques, instead evaluates path expression queries in schema space, which allows efficient computation of complex queries over vast amounts of data. This paper describes how to construct query automata that compute all regular expression queries, including those with wildcards. Furthermore, a data structure named schema automata is proposed to evaluate containment queries, which are very difficult to handle with conventional automata. To improve the efficiency of schema automata, methods to reduce and persist them are proposed. Finally, a performance study of the proposed methods is presented.
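
To make the idea of evaluating a path expression in schema space concrete, the following is a minimal sketch and not the construction given in the paper. It assumes a simplified XPath-like syntax (`/` for child, `//` for descendant, `*` as a wildcard) and a hypothetical list of label paths, `SCHEMA_PATHS`, standing in for the schema. The query is turned into a small nondeterministic automaton whose states count the matched steps, and the automaton is run over schema paths rather than over instance nodes. The names `parse`, `matches`, and `evaluate` are illustrative only.

```python
def parse(query):
    """Split '/a//b/*' into steps (axis, label); axis is 'child' or 'desc'."""
    steps, axis = [], "child"
    for token in query.split("/")[1:]:
        if token == "":            # an empty token comes from '//'
            axis = "desc"
        else:
            steps.append((axis, token))
            axis = "child"
    return steps


def matches(steps, path):
    """Simulate the query automaton on one label path.

    State i means the first i steps have been matched; state len(steps)
    is the only accepting state.
    """
    states = {0}
    for label in path:
        nxt = set()
        for i in states:
            if i == len(steps):
                continue           # all steps consumed; extra labels fail
            axis, want = steps[i]
            if want == "*" or want == label:
                nxt.add(i + 1)     # this label satisfies step i
            if axis == "desc":
                nxt.add(i)         # '//' may skip any intermediate label
        states = nxt
    return len(steps) in states


# Hypothetical schema: every distinct root-to-element label path.
SCHEMA_PATHS = [
    ("site",),
    ("site", "regions"),
    ("site", "regions", "item"),
    ("site", "regions", "item", "name"),
    ("site", "people"),
    ("site", "people", "person"),
    ("site", "people", "person", "name"),
]


def evaluate(query, schema_paths=SCHEMA_PATHS):
    """Return the schema paths accepted by the automaton for `query`."""
    steps = parse(query)
    return [p for p in schema_paths if matches(steps, p)]
```

For example, `evaluate("//name")` selects the two schema paths ending in `name`, and `evaluate("/site/*/item/name")` selects only the one under `regions`. In a real system the selected schema paths would then be mapped to the stored instances; that step is outside this sketch.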
