Assisting the Design of XML Schema: Diagnosing Nondeterministic Content Models

One difficulty in designing XML Schema is the requirement that content models be deterministic, known as the unique particle attribution (UPA) constraint; that is, content models must be deterministic regular expressions. Determinism is defined semantically, and no syntactic definition of it is known, which makes it hard for users to write content models that satisfy it. To date, however, no work provides diagnostic information when a content model is nondeterministic, although such information would greatly help designers understand and repair nondeterministic models. In this paper we investigate algorithms that check whether a regular expression is deterministic and provide diagnostic information when it is not. With this information, designers can see more clearly why an expression is nondeterministic, which reduces the difficulty of designing XML Schema.
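
To make the notion of determinism concrete, the following is a minimal sketch of the classical position-based test (first and follow sets, as in the Glushkov construction), extended to report which occurrences clash. It is an illustration under stated assumptions, not the algorithms developed in the paper: the tuple encoding of expressions and the names `sym`, `cat`, `alt`, `star`, `analyse`, and `diagnose` are hypothetical, and counting operators such as minOccurs/maxOccurs are not handled.

```python
from itertools import count

# Hypothetical tuple encoding of regular expressions (an assumption of this sketch).
def sym(a): return ("sym", a)
def cat(l, r): return ("cat", l, r)
def alt(l, r): return ("alt", l, r)
def star(e): return ("star", e)

def analyse(expr):
    """Number symbol occurrences left to right; return (first, follow, labels)."""
    ids = count(1)
    labels = {}   # position -> symbol
    follow = {}   # position -> positions that may come next

    def walk(e):
        # Returns (nullable, first positions, last positions) of e.
        kind = e[0]
        if kind == "sym":
            p = next(ids)
            labels[p] = e[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind == "star":
            _, f, l = walk(e[1])
            for p in l:              # a last position can loop back to a first one
                follow[p] |= f
            return True, f, l
        n1, f1, l1 = walk(e[1])
        n2, f2, l2 = walk(e[2])
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        for p in l1:                 # concatenation: left's last -> right's first
            follow[p] |= f2
        return (n1 and n2,
                (f1 | f2) if n1 else f1,
                (l1 | l2) if n2 else l2)

    _, first, _ = walk(expr)
    return first, follow, labels

def diagnose(expr):
    """Return conflicts as (after_position or None, symbol, pos1, pos2).

    An empty list means the expression is deterministic; None as the first
    component means the clash occurs at the start of the expression.
    """
    first, follow, labels = analyse(expr)
    conflicts = []
    candidates = [(None, first)] + [(p, follow[p]) for p in sorted(follow)]
    for src, positions in candidates:
        seen = {}
        for q in sorted(positions):
            a = labels[q]
            if a in seen:            # two distinct occurrences compete for symbol a
                conflicts.append((src, a, seen[a], q))
            else:
                seen[a] = q
    return conflicts

# (a|b)*a is nondeterministic; b*a(b*a)* denotes the same language deterministically.
nondet = cat(star(alt(sym("a"), sym("b"))), sym("a"))
det = cat(cat(star(sym("b")), sym("a")),
          star(cat(star(sym("b")), sym("a"))))
print(diagnose(nondet))
print(diagnose(det))
```

For (a|b)*a, the first reported conflict says that at the start of the expression an incoming `a` could match either occurrence 1 or occurrence 3, which is the kind of explanation a designer would need in order to repair the content model.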