Predicate-based Filtering of XPath Expressions

The XML/XPath filtering problem has found wide-spread interest. In this paper, we propose a novel algorithm for solving it. Our approach encodes XPath expressions (XPEs) as ordered sets of predicates and translates XML documents into sets of tuples, which are evaluated over these predicates. Predicates representing overlapping portions of XPEs are stored and processed once, thus fully exploiting potential overlap in XPEs. We experimentally evaluate the performance of our algorithm, demonstrating its scalability to millions of XPEs, with matching performance in the millisecond range. We show interesting trade-offs to alternative approaches.

[1]  Dan Suciu,et al.  Stream processing of XPath queries with predicates , 2003, SIGMOD '03.

[2]  Laks V. S. Lakshmanan,et al.  On Efficient Matching of Streaming XML Documents and Queries , 2002, EDBT.

[3]  Jussi Myllymaki,et al.  Implementing a scalable XML publish/subscribe system using relational database systems , 2004, SIGMOD '04.

[4]  Rina Dechter,et al.  Backjump-based backtracking for constraint satisfaction problems , 2002, Artif. Intell..

[5]  Rajeev Rastogi,et al.  Efficient filtering of XML documents with XPath expressions , 2002, The VLDB Journal.

[6]  Roger Barga,et al.  Proceedings of the 22nd International Conference on Data Engineering Workshops, ICDE 2006, 3-7 April 2006, Atlanta, GA, USA , 2006, ICDE Workshops.

[7]  Hamid Pirahesh,et al.  Efficiently publishing relational data as XML documents , 2001, The VLDB Journal.

[8]  Luis Gravano,et al.  Navigation- vs. index-based XML multi-query processing , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[9]  Hao Zhang,et al.  Path sharing and predicate evaluation for high-performance XML filtering , 2003, TODS.

[10]  Ahmad Ashari,et al.  Storing And Querying XML Data Using RDBMS , 2004, iiWAS.

[11]  R. Dechter,et al.  Backtracking Algorithms for Constraint Satisfaction Problems , 1999 .

[12]  Susan B. Davidson,et al.  BLAS: an efficient XPath processing system , 2004, SIGMOD '04.

[13]  Yanlei Diao,et al.  YFilter: efficient and scalable filtering of XML documents , 2002, Proceedings 18th International Conference on Data Engineering.

[14]  Dennis Shasha,et al.  Filtering algorithms and implementation for very fast publish/subscribe systems , 2001, SIGMOD '01.

[15]  Dan Suciu,et al.  Processing XML Streams with Deterministic Automata , 2003, ICDT.

[16]  Michael J. Franklin,et al.  Efficient Filtering of XML Documents for Selective Dissemination of Information , 2000, VLDB.

[17]  Edward M. Reingold,et al.  Backtrack programming techniques , 1975, CACM.