Typechecking XML views of relational databases

Motivated by the need to export relational databases as XML data in the context of the Web, we investigate the typechecking problem for transformations of relational data into tree data (XML). The problem consists of statically verifying that the output of every transformation belongs to a given output tree language (specified for XML by a DTD), for input databases satisfying given integrity constraints. The typechecking problem is parameterized by the class of formulas defining the transformation, the class of output tree languages, and the class of integrity constraints. While undecidable in its most general formulation, the typechecking problem has many special cases of practical interest that turn out to be decidable. The main contribution of this article is to trace a fairly tight boundary of decidability for typechecking in this framework. In the decidable cases we examine the complexity, and show lower and upper bounds. We also exhibit a practically appealing restriction for which typechecking is in PTIME.
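To make the problem statement concrete, here is a minimal Python sketch of the ingredients involved. It is not the article's formalism: the toy DTD, the relation names `depts` and `works_in`, and the functions `transform` and `conforms` are all hypothetical, chosen only for illustration. The sketch validates the output of one transformation on one given database, which is a much weaker check than typechecking.

```python
import re

# Hypothetical toy DTD: each element name maps to a regular expression
# over the sequence of its children's labels (each label followed by a space).
DTD = {
    "db": r"(dept )*",
    "dept": r"name (emp )+",  # every dept needs a name and at least one emp
    "name": r"",
    "emp": r"",
}

def transform(depts, works_in):
    """Export two relations as a tree of (label, children) pairs.

    depts is a set of department names; works_in is a set of
    (department, employee) tuples. Both names are assumptions.
    """
    return ("db", [
        ("dept",
         [("name", [])]
         + [("emp", []) for (d2, _e) in sorted(works_in) if d2 == d])
        for d in sorted(depts)
    ])

def conforms(tree, dtd):
    """Check one tree against the toy DTD, node by node."""
    label, children = tree
    word = "".join(child[0] + " " for child in children)
    if label not in dtd or re.fullmatch(dtd[label], word) is None:
        return False
    return all(conforms(child, dtd) for child in children)

# A database where every department has an employee: the output conforms.
ok_tree = transform({"it", "sales"},
                    {("sales", "ann"), ("sales", "bob"), ("it", "eve")})
# A database with an empty department: the output violates the DTD.
bad_tree = transform({"hr", "sales"}, {("sales", "ann")})

print(conforms(ok_tree, DTD))
print(conforms(bad_tree, DTD))
```

The second database shows why integrity constraints matter: under a constraint stating that every department in `depts` appears in `works_in`, this particular transformation would always produce conforming output. Typechecking asks for that guarantee statically, for all databases satisfying the constraints, rather than by validating individual outputs as the sketch does.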
