The Standard Generalized Markup Language (SGML) allows users to define document type definitions (DTDs), which are essentially extended context-free grammars in a notation that is similar to extended Backus-Naur form. The right-hand side of a production is called a content model and its semantics can be modified by exceptions. We give precise definitions of the semantics of exceptions and prove that they do not increase the expressive power of SGML. For each DTD with exceptions we can construct a structurally equivalent extended context-free grammar. On the other hand, exceptions are a powerful shorthand notation—eliminating them may cause exponential growth in the size of a DTD.
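To make the notion of exceptions concrete, the following is a minimal sketch of how inclusion and exclusion exceptions might alter validation against a content model. It is a simplified illustration, not the formal semantics defined in the paper. The toy DTD, its element names (`doc`, `para`, `note`, `text`), and the function `valid` are all hypothetical. The sketch assumes that an included element never also appears in the content model it is inserted into, that exceptions are inherited by all descendants, and that exclusions override inclusions.

```python
import re

# Hypothetical toy DTD: element name -> (content model, inclusions, exclusions).
# Content models are regular expressions over child names, each name
# followed by a single space.
DTD = {
    "doc":  (r"(para )+", {"note"}, set()),   # notes may float anywhere in doc
    "para": (r"(text )*", set(), set()),
    "note": (r"(para )+", set(), {"note"}),   # no note inside a note
    "text": (r"", set(), set()),
}

def valid(node, incl=frozenset(), excl=frozenset()):
    """Check a (name, children) tree against DTD, propagating exceptions."""
    name, children = node
    model, add_incl, add_excl = DTD[name]
    # Exceptions are inherited by descendants; exclusions win over inclusions.
    excl = excl | add_excl
    incl = (incl | add_incl) - excl
    names = [child[0] for child in children]
    # An excluded element may not occur here at all.
    if any(n in excl for n in names):
        return False
    # Included elements are skipped before matching the content model
    # (simplifying assumption: they never occur in the model itself).
    core = "".join(n + " " for n in names if n not in incl)
    if re.fullmatch(model, core) is None:
        return False
    return all(valid(child, incl, excl) for child in children)

# A note inside a para is accepted only because doc includes note.
ok = ("doc", [("para", [("text", []), ("note", [("para", [])])])])
# A note nested inside another note is rejected by the exclusion.
bad = ("doc", [("para", [("note", [("para", [("note", [("para", [])])])])])])
```

In this sketch a single `+(note)`-style inclusion on `doc` takes effect at every depth without touching any other content model. Writing the same constraint without exceptions would mean rewriting each affected content model separately, which hints at why eliminating exceptions can enlarge a DTD considerably.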