Tree Automata Based Efficient XPath Evaluation over XML Data Stream

Efficiently evaluating a massive set of XPath expressions over an XML stream is a fundamental problem in data stream applications. Current methods either do not fully support the commonly used features of XPath or cannot meet the space and time requirements of data stream applications. This paper proposes XEBT, a new tree automata based machine, to solve the problem. Unlike traditional approaches, XEBT has the following features. First, it is based on tree automata, whose expressive power lets it support XPath {()} without extra states or intermediate results. Second, XEBT supports several optimization strategies, including DTD-based construction of XPath tree automata, partial determinization to reduce the number of concurrent states at run time at limited extra space cost, and a combination of bottom-up and top-down evaluation. Experimental results show that XEBT supports complex XPath expressions and outperforms previous work in both efficiency and space cost.
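To make the idea of bottom-up tree automaton evaluation over a stream concrete, the following is a minimal sketch and not the XEBT machine itself. It assumes a single hard-coded query, //a[b][c] (an element a with both a b child and a c child), and the names BottomUpMatcher, count_matches, and the state labels q_b, q_c, q_match are hypothetical. Each element's state set is computed from its children's state sets when its end tag arrives, so memory grows with document depth and no matched subtrees are buffered.

```python
import xml.sax


class BottomUpMatcher(xml.sax.ContentHandler):
    """Hypothetical bottom-up automaton for the fixed query //a[b][c]."""

    def __init__(self):
        super().__init__()
        # One set of child states per open element; the first entry
        # stands for the (virtual) parent of the document root.
        self.stack = [set()]
        self.matches = 0

    def startElement(self, name, attrs):
        # Open a fresh set that will collect the states of this element's children.
        self.stack.append(set())

    def endElement(self, name):
        # All children are closed, so their states are known: apply the transition.
        child_states = self.stack.pop()
        states = set()
        if name == "b":
            states.add("q_b")
        if name == "c":
            states.add("q_c")
        if name == "a" and {"q_b", "q_c"} <= child_states:
            states.add("q_match")
            self.matches += 1
        # Report this element's states to its parent.
        self.stack[-1] |= states


def count_matches(xml_text):
    """Return the number of elements selected by //a[b][c] in xml_text."""
    handler = BottomUpMatcher()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.matches
```

For example, count_matches("<r><a><b/><c/></a><a><b/></a></r>") counts only the first a element, since the second has no c child. A real system would compile many queries into shared states rather than hard-coding one transition function as done here.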
