Frontiers of tractability for typechecking simple XML transformations

Typechecking consists of statically verifying whether the output of an XML transformation always conforms to an output type for documents satisfying a given input type. We focus on complete algorithms, which always produce the correct answer. We consider top-down XML transformations incorporating XPath expressions, and model document types abstractly by grammars and tree automata. By restricting schema languages and transformations, we identify several practical settings for which typechecking can be done in polynomial time. Moreover, the resulting framework provides a rather complete picture, as we show that most scenarios cannot be enlarged without rendering the typechecking problem intractable. The present research thus sheds light on when to use fast complete algorithms and when to resort to sound but incomplete ones.
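
The typechecking problem described above can be stated compactly as follows. This is only a sketch of the standard formulation: the symbols T (the transformation), S_in and S_out (the input and output types), L(S) (the set of documents satisfying S), and A (a candidate algorithm) are notation assumed here for illustration and are not taken from the text.

```latex
% Assumed notation: T is the transformation, S_in and S_out are the input
% and output types, and L(S) is the set of documents that satisfy S.
\[
  T \text{ typechecks w.r.t. } (S_{\mathrm{in}}, S_{\mathrm{out}})
  \iff
  \forall t \in L(S_{\mathrm{in}}) :\; T(t) \in L(S_{\mathrm{out}})
\]
% A complete algorithm A decides this property exactly. A sound but
% incomplete algorithm only guarantees one direction:
\[
  A(T, S_{\mathrm{in}}, S_{\mathrm{out}}) = \text{accept}
  \;\Longrightarrow\;
  \forall t \in L(S_{\mathrm{in}}) :\; T(t) \in L(S_{\mathrm{out}})
\]
```

Under this reading, a sound but incomplete algorithm never accepts a transformation that can produce an ill-typed output, but it may reject some transformations that do typecheck.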
