Mixing Bottom-Up and Top-Down XPath Query Evaluation

Available XPath evaluators follow one of two strategies to evaluate an XPath query on hierarchical XML data: they evaluate it either top-down or bottom-up. In this paper, we present an approach that allows an XPath query to be evaluated in arbitrary directions, including a mixture of bottom-up and top-down evaluation. For each location step, it can be decided whether to evaluate it top-down or bottom-up, so that we can, for example, start with a location step of low selectivity and evaluate all child-axis steps top-down at the same time. Our experiments show that this approach allows for very efficient XPath evaluation: it is 15 times faster than the JDK 1.6 XPath query evaluation (JAXP), and several times faster than MonetDB if the file size is ≤ 30 MB or the query to be evaluated contains at least one location step with low selectivity. Furthermore, our approach is also applicable to most compressed XML formats, which may prevent swapping when a large XML document does not fit into main memory but its compressed representation does.
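To make the idea of mixing directions concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes an in-memory tree built with Python's `xml.etree.ElementTree` and a query that uses child-axis steps only. The function `evaluate_mixed`, its `pivot` parameter, and the sample document are hypothetical names introduced here for illustration. Evaluation starts at a chosen location step (the pivot), checks the steps before it bottom-up against the ancestors, and follows the steps after it top-down.

```python
import xml.etree.ElementTree as ET


def evaluate_mixed(root, steps, pivot):
    """Evaluate the child-axis path /steps[0]/.../steps[-1], starting at steps[pivot].

    Illustrative only: 'steps' is a list of element names and 'pivot' is the
    index of the location step at which evaluation starts.
    """
    # ElementTree keeps no parent pointers, so build a child -> parent map once.
    parent = {child: node for node in root.iter() for child in node}

    def matches_upward(node):
        # Bottom-up part: compare steps[pivot-1], ..., steps[0] with the ancestors.
        for tag in reversed(steps[:pivot]):
            node = parent.get(node)
            if node is None or node.tag != tag:
                return False
        # The node matched by the first step must be the document element.
        return node is root

    def descend(nodes, tags):
        # Top-down part: follow the remaining child-axis steps.
        for tag in tags:
            nodes = [child for n in nodes for child in n if child.tag == tag]
        return nodes

    # Start at the pivot step, keep only nodes whose ancestors fit the prefix,
    # then continue downward from the surviving nodes.
    candidates = [n for n in root.iter(steps[pivot]) if matches_upward(n)]
    return descend(candidates, steps[pivot + 1:])


# Hypothetical sample document and query /site/people/person/name.
doc = ET.fromstring(
    "<site><people><person><name>Ann</name></person>"
    "<person><name>Bob</name></person></people>"
    "<regions><person><name>Eve</name></person></regions></site>"
)
steps = ["site", "people", "person", "name"]
names = [n.text for n in evaluate_mixed(doc, steps, pivot=2)]
```

In this sketch the choice of `pivot` changes only the amount of work, not the result: every pivot index yields the same node list as a purely top-down evaluation of the same path. How the pivot is chosen, and how other axes or compressed representations are handled, is outside what this sketch covers.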
