Simplifying XPath queries for optimization with regard to the elimination of intersect and except operators

XPath is widely used as an XML query language and is embedded in XQuery expressions and XSLT stylesheets. In this paper, which is an extended version of [Sven Groppe, Stefan Böttcher, Jinghua Groppe, XPath Query Simplification with regard to the elimination of intersect and except operators, in: 3rd International Workshop on XML Schema and Data Management (XSDM 2006) in conjunction with IEEE ICDE 2006, Atlanta, USA, 2006], we propose a rule set that logically simplifies XPath queries by means of a heuristic method in order to reduce processing time. Furthermore, we show how to replace the XPath 2.0 intersect and except operators in a given XPath query with computed filter expressions. A performance evaluation compares the execution times of the original XPath queries, which contain the intersect and except operators, with those of the queries produced by our simplification approach. It shows that, depending on the query evaluator used and on the original query, performance improvements of up to a factor of 350 are possible. Additionally, we prove that XPath 1.0 is closed under complementation and first-order complete.
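
To make the idea of replacing intersect and except with filter expressions concrete, the following minimal sketch checks two simple equivalences on a toy document: `//a[b] intersect //a[c]` against `//a[b][c]`, and `//a[b] except //a[c]` against `//a[b][not(c)]`. These two rewrites are illustrative assumptions and are not taken from the paper's rule set. The sample document and the helpers `ids`, `intersect` and `except_` are hypothetical. Because Python's `xml.etree.ElementTree` supports only a subset of XPath, the set operators and the `not(c)` filter are emulated in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy document: four <a> elements with different children.
DOC = ET.fromstring(
    "<r>"
    "<a id='1'><b/><c/></a>"
    "<a id='2'><b/></a>"
    "<a id='3'><c/></a>"
    "<a id='4'/>"
    "</r>"
)


def ids(nodes):
    """Return the id attributes of a node list, for easy comparison."""
    return [n.get("id") for n in nodes]


def intersect(xs, ys):
    """Emulate XPath 2.0 `intersect` by node identity, keeping the order of xs."""
    keep = {id(y) for y in ys}
    return [x for x in xs if id(x) in keep]


def except_(xs, ys):
    """Emulate XPath 2.0 `except` by node identity, keeping the order of xs."""
    drop = {id(y) for y in ys}
    return [x for x in xs if id(x) not in drop]


with_b = DOC.findall(".//a[b]")  # operand 1: //a[b]
with_c = DOC.findall(".//a[c]")  # operand 2: //a[c]

# //a[b] intersect //a[c]  versus the single filtered path //a[b][c]
via_intersect = intersect(with_b, with_c)
via_filter_and = DOC.findall(".//a[b][c]")

# //a[b] except //a[c]  versus //a[b][not(c)]
# (ElementTree has no not(), so the negated filter is written in Python.)
via_except = except_(with_b, with_c)
via_filter_not = [a for a in with_b if a.find("c") is None]

print(ids(via_intersect), ids(via_filter_and))
print(ids(via_except), ids(via_filter_not))
```

In each pair the filtered form evaluates a single path with predicates instead of computing two node sets and combining them, which is the kind of saving the reported speed-ups depend on. How much is gained in practice depends on the evaluator.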
