Looping caterpillars [semistructured data querying]

There are two main paradigms for querying semistructured data: regular path queries and XPath. The aim of this paper is to provide a synthesis of the two. The synthesis is obtained by a small addition to tree walk automata and the corresponding caterpillar expressions, both evaluated on unranked, finite, sibling-ordered trees. At the expression level, we add an operator whose meaning is intersection with the identity relation. The resulting language can express every first-order definable relation, and its expressive power is characterized by pebble tree walk automata that cannot inspect pebbles. We also define an expansion of the caterpillar expressions whose expressive power is characterized by ordinary pebble tree walk automata. Combining results from Bloem-Engelfriet and Gottlob-Koch, we also define an XPath-like query language that is complete for all MSO-definable binary relations.

[1] Joost Engelfriet, et al. Tree-Walking Pebble Automata, 1999, Jewels are Forever.

[2] Thomas Colcombet, et al. Tree-Walking Automata Cannot Be Determinized, 2006, ICALP.

[3] Frank Neven, et al. A formal model for an expressive fragment of XSLT, 2000, Inf. Syst.

[4] Georg Gottlob, et al. Monadic datalog and the expressive power of languages for web information extraction, 2002, JACM.

[5] Neil Immerman. Upper and lower bounds for first order expressibility, 1980, 21st Annual Symposium on Foundations of Computer Science (sfcs 1980).

[6] Alfred V. Aho, et al. Translations on a context free grammar, 1969, STOC.

[7] Joost Engelfriet, et al. Characterization of Properties and Relations defined in Monadic Second Order Logic on the Nodes of T, 1997.

[8] Maarten Marx, et al. First Order Paths in Ordered Trees, 2005, ICDT.

[9] Maarten Marx, et al. Relation Algebra with Binders, 2001, J. Log. Comput.

[10] Thomas Schwentick, et al. On the power of tree-walking automata, 2000, Inf. Comput.

[11] Bernhard Beckert, et al. Dynamic Logic, 2007, The KeY Approach.

[12] Jan-Pascal van Best, et al. Trips on Trees, 1999, Acta Cybern.

[13] Dan Suciu, et al. Typechecking for XML transformers, 2000, J. Comput. Syst. Sci.

[14] Marcus Kracht. Inessential Features, 1996, LACL.

[15] Georg Gottlob, et al. A Formal Comparison of Visual Web Wrapper Generators, 2003, SOFSEM.

[16] J W Ballard, et al. Data on the web?, 1995, Science.

[17] Steven J. DeRose, et al. XML Path Language (XPath) Version 1.0, 1999.

[18] Christoph Koch, et al. Efficient Processing of Expressive Node-Selecting Queries on XML Data in Secondary Storage: A Tree Automata-based Approach, 2003, VLDB.

[19] Maarten Marx, et al. XPath with Conditional Axis Relations, 2004, EDBT.

[20] Patrick Blackburn, et al. Representation, Reasoning, and Relational Structures: a Hybrid Logic Manifesto, 2000, Log. J. IGPL.

[21] Frank Neven, et al. Automata, Logic, and XML, 2002, CSL.

[22] Derick Wood, et al. Caterpillars, context, tree automata and tree pattern matching, 2000, Developments in Language Theory.

[23] Thomas Colcombet, et al. Tree-walking automata do not recognize all regular languages, 2005, STOC '05.

[24] Georg Gottlob, et al. Efficient Algorithms for Processing XPath Queries, 2002, VLDB.