From XQuery to relational logics

Predicate logic has long been seen as a good foundation for querying relational data. This is embodied in the correspondence between relational calculus and first-order logic, and can also be seen in mappings from fragments of the standard relational query language SQL to extensions of first-order logic (e.g. with counting). A key question is what the analog of this correspondence is for querying tree-structured data, such as XML documents. We formalize this as the question of the appropriate logical query language for defining transformations on tree-structured data. The predominant paradigm among practitioners for defining such transformations is top-down tree building. This is embodied by the XQuery query language, which builds the output tree in parallel starting at the root, based on variable bindings and nodeset queries in the XPath language. The goal of this article is to measure the expressiveness of top-down tree-building languages against the benchmark of predicate logic. We start by giving a formalized version of XQuery, XQ, that can serve as a representative of the top-down approach. We show that all queries in XQ with only atomic equality are equivalent to first-order interpretations, an analog of first-order logic (FO) in the setting of transformations of tree-structured data. We then consider fragments of atomic XQ. We identify a fragment that maps efficiently into first-order logic, a fragment that maps into existential first-order logic, and a fragment that maps into the navigationally two-variable fragment of first-order logic, an analog of two-variable logic in the setting where data values are unbounded. When XQ is considered with deep equality, we find that queries can be translated into FO with counting (FO(Cnt)). These translations from XQ to logical languages on relations have a number of consequences: we use them to derive complexity bounds for XQ fragments and to bound the Boolean expressiveness of XQ fragments.
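
As an informal illustration of the correspondence described above, the following Python sketch evaluates one toy query in two ways: by top-down tree building, and by formulas over a relational encoding of the input tree, in the spirit of a first-order interpretation. It is a sketch under stated assumptions, not the article's definition of XQ or its translation: the query, the path-based node identifiers, and the names `Node`, `encode`, `top_down`, and `interpretation` are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def encode(root):
    """Encode a tree as relations: a label map and a set of (parent, child) edges.

    Node identifiers are paths of child indexes (an assumption of this sketch).
    """
    labels, edges = {}, set()

    def visit(node, nid):
        labels[nid] = node.label
        for i, child in enumerate(node.children):
            cid = nid + (i,)
            edges.add((nid, cid))
            visit(child, cid)

    visit(root, ())
    return labels, edges


def top_down(root):
    """Top-down style: <out>{ for $b in /*/book return <item/> }</out>."""
    return Node("out", [Node("item") for b in root.children if b.label == "book"])


def interpretation(labels, edges):
    """The same toy query, written as formulas over the relational encoding."""
    root = ()
    # Domain formula: x is the root, or x is a child of the root labelled "book".
    dom = {x for x in labels
           if x == root or ((root, x) in edges and labels[x] == "book")}
    # Child formula: x is the root and y is any other element of the domain.
    out_edges = {(x, y) for x in dom for y in dom if x == root and y != root}
    # Label formulas: the root becomes "out", every other element becomes "item".
    out_labels = {x: ("out" if x == root else "item") for x in dom}
    return out_labels, out_edges


doc = Node("lib", [Node("book"), Node("cd"), Node("book")])
tree_result = top_down(doc)
out_labels, out_edges = interpretation(*encode(doc))
```

On this input both evaluations describe the same output shape, an `out` root with one `item` child per `book` child of the input root. In the second version the output nodes are not constructed step by step; they are selected from the input nodes by formulas, which is the sense in which the query is read as a logical definition of the output structure.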
