Which XML Schemas Admit 1-Pass Preorder Typing?

It is shown that the class of regular tree languages admitting one-pass preorder typing is exactly the class defined by restrained competition tree grammars, introduced by Murata et al. [14]. In a streaming context, this class is the largest class of XSDs for which every element in a document can be typed when its opening tag is encountered. The main technical machinery consists of semantic characterizations of restrained competition grammars and their subclasses. In particular, they can be characterized in terms of the context of nodes, closure properties, allowed patterns, and guarded DTDs. It is further shown that deciding whether a schema is restrained competition is tractable. Deciding whether a schema is equivalent to a restrained competition tree grammar, or to one of its subclasses, is much more difficult: it is EXPTIME-complete. We show that our semantic characterizations allow for easy optimization and minimization algorithms. Finally, we relate the notion of one-pass preorder typing to the existing XML Schema standard.
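To make the notion of typing an element at its opening tag concrete, the following is a minimal sketch and not the paper's algorithm. It assumes each type's content model is given as a trim DFA over child type names; `TypeDef`, `type_stream`, and the book/chapter grammar are hypothetical names introduced here for illustration. In the example, `FirstChapter` and `LaterChapter` compete (both label `chapter`), but the content model of `Book` restrains the competition: after any prefix of children, at most one of them is allowed next.

```python
from dataclasses import dataclass


@dataclass
class TypeDef:
    element: str    # element name carried by nodes of this type
    start: int      # initial state of the content-model DFA
    accepting: set  # accepting states of the content-model DFA
    delta: dict     # (state, child type name) -> next state


def type_stream(events, types, root_type):
    """Assign a type to each element when its opening tag is read.

    events is a sequence of ("open", name) / ("close", name) pairs.
    Returns the (element, type) pairs in preorder, or raises ValueError.
    """
    stack = []   # one [type name, current DFA state] entry per open element
    typing = []
    for kind, name in events:
        if kind == "open":
            if not stack:
                if typing or types[root_type].element != name:
                    raise ValueError("unexpected root element: " + name)
                chosen = root_type
            else:
                parent_type, state = stack[-1]
                # Types allowed next in the parent's content model that
                # carry this element name (assumption: the DFA is trim).
                found = [(t, nxt)
                         for (s, t), nxt in types[parent_type].delta.items()
                         if s == state and types[t].element == name]
                if len(found) != 1:
                    raise ValueError("no unique type for element: " + name)
                chosen, next_state = found[0]
                stack[-1][1] = next_state  # advance the parent's DFA
            typing.append((name, chosen))
            stack.append([chosen, types[chosen].start])
        else:
            if not stack:
                raise ValueError("unmatched closing tag: " + name)
            closed_type, state = stack.pop()
            tdef = types[closed_type]
            if tdef.element != name or state not in tdef.accepting:
                raise ValueError("invalid content for element: " + name)
    if stack:
        raise ValueError("unclosed elements remain")
    return typing


# Hypothetical grammar: Book -> book[FirstChapter LaterChapter*]
EXAMPLE_TYPES = {
    "Book": TypeDef("book", 0, {1},
                    {(0, "FirstChapter"): 1, (1, "LaterChapter"): 1}),
    "FirstChapter": TypeDef("chapter", 0, {1}, {(0, "Intro"): 1}),
    "LaterChapter": TypeDef("chapter", 0, {0}, {(0, "Para"): 0}),
    "Intro": TypeDef("intro", 0, {0}, {}),
    "Para": TypeDef("para", 0, {0}, {}),
}

# <book><chapter><intro/></chapter><chapter><para/><para/></chapter></book>
EXAMPLE_EVENTS = [
    ("open", "book"),
    ("open", "chapter"), ("open", "intro"), ("close", "intro"),
    ("close", "chapter"),
    ("open", "chapter"), ("open", "para"), ("close", "para"),
    ("open", "para"), ("close", "para"), ("close", "chapter"),
    ("close", "book"),
]

result = type_stream(EXAMPLE_EVENTS, EXAMPLE_TYPES, "Book")
```

In this sketch the type of each `chapter` is fixed using only the parent's type and the siblings already read, which is why a stack of open elements suffices. When two competing types are both allowed at the same position, the sketch rejects the input instead of guessing.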

[1] Albert R. Meyer, et al. Word problems requiring exponential time (Preliminary Report), 1973, STOC.

[2] Helmut Seidl. Deciding Equivalence of Finite Tree Automata, 1990, SIAM J. Comput.

[3] Alex K. Simpson, et al. Computational Adequacy in an Elementary Topos, 1998, CSL.

[4] Derick Wood, et al. One-Unambiguous Regular Languages, 1998, Inf. Comput.

[5] Yannis Papakonstantinou, et al. DTD inference for views of XML data, 2000, PODS.

[6] Typechecking for Semistructured Data, 2001, DBPL.

[7] Derick Wood, et al. Regular tree and regular hedge languages over unranked alphabets, 2001.

[8] Victor Vianu, et al. Validating streaming XML documents, 2002, PODS.

[9] Frank Neven, et al. Automata, Logic, and XML, 2002, CSL.

[10] Eric van der Vlist, et al. XML Schema, 2002.

[11] Eric van der Vlist, et al. Relax NG, 2003.

[12] C. M. Sperberg-McQueen. Logic grammars and XML Schema, 2003, Extreme Markup Languages®.

[13] Frank Neven, et al. Typechecking Top-Down Uniform Unranked Tree Transducers, 2003, ICDT.

[14] Yannis Papakonstantinou, et al. Incremental Validation of XML Documents, 2003, ICDT.

[15] Benjamin C. Pierce, et al. XDuce: A statically typed XML processing language, 2003, TOIT.

[16] Gabriel M. Kuper, et al. Structural Properties of XPath Fragments, 2003, ICDT.

[17] Complexity of Decision Problems for Simple Regular Expressions, 2004, MFCS.

[18] Yannis Papakonstantinou, et al. Incremental validation of XML documents, 2003, TODS.

[19] Rajeev Alur, et al. Visibly pushdown languages, 2004, STOC '04.

[20] Frank Neven, et al. Frontiers of tractability for typechecking simple XML transformations, 2004, PODS.

[21] Jan Kratochvíl, et al. Mathematical Foundations of Computer Science 2004, 2004, Lecture Notes in Computer Science.

[22] Stefanie Scherzinger, et al. Attribute grammars for scalable query processing on XML streams, 2005, The VLDB Journal.

[23] Murali Mani, et al. Taxonomy of XML schema languages using formal language theory, 2005, TOIT.

[24] Sebastian Maneth, et al. Efficient Memory Representation of XML Documents, 2005, DBPL.