A Decision Procedure for XPath Containment

XPath is the standard language for addressing parts of an XML document. We present a sound and complete decision procedure for containment of XPath queries. The XPath fragment we consider covers most of the language features used in practice. Specifically, we show how XPath queries can be translated into equivalent formulas in monadic second-order logic. Using this translation, we construct an optimized logical formulation of the containment problem, which is decided using tree automata. When the containment relation does not hold between two XPath expressions, a counter-example XML tree is generated. We provide a complexity analysis together with practical experiments that illustrate the efficiency of the decision procedure in realistic scenarios.
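To make the notions of containment and counter-example tree concrete, the following is a minimal sketch and not the procedure described above: it replaces the monadic second-order translation and tree automata with a bounded brute-force search over small trees. Such a search can only refute containment, never prove it. The query encoding (a list of `(axis, test)` steps with axes `"child"` and `"desc"`), the tuple representation of trees, and all function names are assumptions made for illustration.

```python
# Illustrative sketch only: bounded search for a counter-example tree.
# This is NOT the MSO / tree-automata procedure; it cannot prove containment.
# A tree is (label, (child, child, ...)); a node is identified by its path
# of child indices from the root. A query is a list of (axis, test) steps.

def nodes(tree, path=()):
    """Yield (path, subtree) for every node of the tree, root first."""
    yield path, tree
    for i, child in enumerate(tree[1]):
        yield from nodes(child, path + (i,))

def subtree(tree, path):
    """Return the subtree reached by following child indices in path."""
    for i in path:
        tree = tree[1][i]
    return tree

def evaluate(query, tree):
    """Return the set of node paths selected by query from the root."""
    current = {()}
    for axis, test in query:
        selected = set()
        for p in current:
            sub = subtree(tree, p)
            if axis == "child":
                cands = [(p + (i,), c) for i, c in enumerate(sub[1])]
            else:  # "desc": proper descendants of the context node
                cands = [(p + q, t) for q, t in nodes(sub) if q]
            selected.update(q for q, t in cands if test == "*" or t[0] == test)
        current = selected
    return current

def forests(n, labels):
    """Yield every ordered forest with exactly n nodes in total."""
    if n == 0:
        yield ()
        return
    for k in range(1, n + 1):
        for first in trees(k, labels):
            for rest in forests(n - k, labels):
                yield (first,) + rest

def trees(n, labels):
    """Yield every labelled ordered tree with exactly n nodes."""
    for label in labels:
        for kids in forests(n - 1, labels):
            yield (label, kids)

def find_counterexample(p1, p2, max_nodes, labels):
    """Return a tree on which p1 selects a node that p2 misses, or None."""
    for n in range(1, max_nodes + 1):
        for t in trees(n, labels):
            if not evaluate(p1, t) <= evaluate(p2, t):
                return t
    return None

a_then_b = [("child", "a"), ("child", "b")]   # roughly: a/b
any_b = [("desc", "b")]                       # roughly: .//b

# No refutation of a/b ⊆ .//b among trees with at most 4 nodes.
print(find_counterexample(a_then_b, any_b, 4, ("a", "b")))
# .//b ⊆ a/b fails: the search returns a small witness tree.
print(find_counterexample(any_b, a_then_b, 4, ("a", "b")))
```

A `None` result here only means that no counter-example exists within the size bound; a complete decision procedure is needed to conclude that containment actually holds.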
