Complex Twig Pattern Query Processing over XML Streams

The problem of processing streaming XML data is gaining widespread attention from the research community. This paper presents a novel approach for processing complex Twig Patterns with OR-predicates and AND-predicates over XML document streams. To improve processing performance, all Twig Patterns are combined into a single prefix query tree that represents the queries by sharing their common prefixes. The OR-predicates and AND-predicates of a node are represented as a separate abstract syntax tree associated with that node. Consequently, all Twig Patterns are evaluated in a single, document-order pass over the input document stream, which avoids the interim results produced by YFilter's post-processing of nested paths. Experimental results show that, compared with the existing approach, the method can significantly improve the performance of matching complex Twig Patterns over XML document streams, especially for large XML documents. Building on prior work, the paper also addresses the optimization of Twig Patterns under a DTD (document type definition) using the structural and constraint information of the DTD; this optimization is static, that is, it is performed before stream processing begins at runtime.
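The prefix-sharing idea can be illustrated with a minimal sketch. It is not the paper's implementation: the names `QueryNode`, `insert_query`, `eval_predicate`, and `count_nodes` are hypothetical, queries are simplified to child-axis tag sequences, and a predicate is assumed to be a small tuple-based syntax tree over the tags of an element's children.

```python
from dataclasses import dataclass, field


@dataclass
class QueryNode:
    """One step of the shared prefix query tree (hypothetical structure)."""
    tag: str
    children: dict = field(default_factory=dict)  # tag -> QueryNode
    # (query_id, predicate AST or None) for every query that ends at this node
    matches: list = field(default_factory=list)


def insert_query(root, query_id, steps, predicate=None):
    """Add a query, reusing nodes for any prefix it shares with earlier queries."""
    node = root
    for tag in steps:
        node = node.children.setdefault(tag, QueryNode(tag))
    node.matches.append((query_id, predicate))
    return node


def eval_predicate(ast, child_tags):
    """Evaluate an AND/OR predicate tree against the set of child tags seen."""
    if ast is None:
        return True
    op = ast[0]
    if op == "child":  # leaf: a child element with this tag must exist
        return ast[1] in child_tags
    if op == "and":
        return all(eval_predicate(sub, child_tags) for sub in ast[1:])
    if op == "or":
        return any(eval_predicate(sub, child_tags) for sub in ast[1:])
    raise ValueError(f"unknown predicate operator: {op!r}")


def count_nodes(node):
    """Count the nodes in the tree, including the given node."""
    return 1 + sum(count_nodes(child) for child in node.children.values())


root = QueryNode("#root")
insert_query(root, "q1", ["a", "b", "c"])
# Assumed reading of a query such as /a/b/d[x or (y and z)]
pred = ("or", ("child", "x"), ("and", ("child", "y"), ("child", "z")))
insert_query(root, "q2", ["a", "b", "d"], pred)
```

In this sketch the two queries share the nodes for `a` and `b`, so the tree holds five nodes (including the artificial root) rather than seven. Keeping each predicate as its own tree attached to the node where the query ends lets the shared path be traversed once for all queries.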
