Comparing XML path expressions

XPath is the standard declarative language for navigating XML data and returning a set of matching nodes. In the context of XSLT/XQuery analysis, query optimization, and XML type checking, XPath decision problems arise naturally. They notably include XPath comparisons such as equivalence (whether two queries always return the same result) and containment (whether, for any tree, the result of one query is included in the result of a second one).

XPath decision problems have attracted considerable research attention, especially regarding the computational complexity of various XPath fragments. What is missing at present, however, is the constructive use of an expressive logic that captures these decision problems while providing practically effective decision procedures.

In this paper, we propose a logic-based framework for the static analysis of XPath. Specifically, we propose the alternation-free modal μ-calculus with converse as the appropriate logic for effectively solving XPath decision problems. We present a translation of a large XPath fragment into μ-calculus, together with practical experiments on containment using a state-of-the-art EXPTIME decision procedure for μ-calculus satisfiability. These preliminary experiments shed light, for the first time, on the cost of checking containment in practice. We believe they reveal encouraging results for further static analysis of XML transformations.
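
To make the two comparisons concrete, here is a minimal sketch in Python. It is not the decision procedure proposed in this paper: it evaluates two queries on a single sample document, so it can only refute containment or equivalence by exhibiting a counterexample tree, never prove them for all trees. The helper names `contained_on` and `equivalent_on` and the sample document are assumptions made for illustration, and the standard library module `xml.etree.ElementTree` supports only a limited XPath subset.

```python
import xml.etree.ElementTree as ET


def contained_on(root, q1, q2):
    """Return True if, on this one tree, every node selected by q1
    is also selected by q2 (hypothetical helper, illustration only)."""
    return set(root.findall(q1)) <= set(root.findall(q2))


def equivalent_on(root, q1, q2):
    """Return True if q1 and q2 select the same node set on this one tree."""
    return contained_on(root, q1, q2) and contained_on(root, q2, q1)


# Assumed sample document: one <c> nested under <b>, one directly under <a>.
doc = ET.fromstring("<a><b><c/></b><c/></a>")

# Every node selected by ./b/c is a descendant <c>, so this direction holds here.
print(contained_on(doc, "./b/c", ".//c"))

# The <c> directly under <a> is selected by .//c but not by ./b/c,
# so this tree is a counterexample to the reverse containment.
print(contained_on(doc, ".//c", "./b/c"))

# Consequently the two queries are not equivalent.
print(equivalent_on(doc, "./b/c", ".//c"))
```

A static analysis, by contrast, has to decide these properties over all trees at once, which is why a logic with a satisfiability decision procedure is needed rather than evaluation on sample documents.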
