Basic Model Theory of XPath on Data Trees

We investigate model-theoretic properties of XPath with data (in)equality tests over the class of data trees, i.e., the class of trees where each node contains a label from a finite alphabet and a data value from an infinite domain. We provide notions of (bi)simulations for XPath logics containing the child, descendant, parent and ancestor axes to navigate the tree. We show that these notions precisely characterize the equivalence relation associated with each logic. We study formula complexity measures consisting of the number of nested axes and nested subformulas in a formula; these notions are akin to the notion of quantifier rank in first-order logic. We show characterization results for fine-grained notions of equivalence and (bi)simulation that take into account these complexity measures. We also prove that positive fragments of these logics correspond to the formulas preserved under (non-symmetric) simulations. We show that the logic including the child axis is equivalent to the fragment of first-order logic invariant under the corresponding notion of bisimulation. If upward navigation is allowed, the characterization fails, but a weaker result can still be established. These results hold over the class of possibly infinite data trees and over the class of finite data trees. Besides their intrinsic theoretical value, we argue that bisimulations are useful tools to prove (non)expressivity results for the logics studied here, and we substantiate this claim with examples.
