Structural annotation

In the tradition of computer-aided text analysis, annotation and categorization are among the operations used to enrich textual material in the course of an analysis carried out with the help of statistical tools and various comparative reading functions. In general, however, such enrichments apply to individual textual units (single occurrences, context units, or lexical forms) and consist simply in associating properties, attributes, or feature sets with those units. The possibility of defining structures or relations among textual units is seldom considered, even though it makes a strictly larger set of enrichments expressible. This is what we call structural annotation. We propose representing structural annotations as stand-off XML documents that comply with the Text Encoding Initiative (TEI) recommendations and are compatible with the research-corpora repository model defined in earlier work. Examples drawn from textual linguistics will illustrate our proposal.
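
To make the contrast with unit-level annotation concrete, the following is a minimal sketch of what a stand-off structural annotation could look like. It is not the encoding defined in this work. It assumes that the base text lives in a separate file (here the hypothetical `corpus.xml`) whose tokens carry `xml:id` values such as `w3` and `w12`, and it uses the TEI `linkGrp` and `link` elements to record a relation between units rather than a property of a single unit. The function name `build_standoff` and the relation label `anaphora` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize TEI elements with a default namespace instead of an ns0: prefix.
ET.register_namespace("", TEI_NS)


def build_standoff(base_doc, relations):
    """Return a TEI <linkGrp> holding one <link> per relation.

    base_doc  -- name of the file containing the annotated text (assumed)
    relations -- list of (relation_type, [token_id, ...]) pairs, where each
                 token_id is assumed to be an xml:id in base_doc
    """
    group = ET.Element(f"{{{TEI_NS}}}linkGrp", {"type": "structural"})
    for rel_type, token_ids in relations:
        if len(token_ids) < 2:
            # A relation connects units; one unit alone is a plain attribute.
            raise ValueError("a relation needs at least two textual units")
        # Each pointer refers into the base document, so the text is untouched.
        target = " ".join(f"{base_doc}#{tid}" for tid in token_ids)
        ET.SubElement(
            group,
            f"{{{TEI_NS}}}link",
            {"type": rel_type, "target": target},
        )
    return group


# Hypothetical example: an anaphoric relation between two tokens.
group = build_standoff("corpus.xml", [("anaphora", ["w12", "w3"])])
xml_text = ET.tostring(group, encoding="unicode")
```

Because the relation is stored apart from the text it points into, several independent annotation layers can refer to the same units without modifying or conflicting with the base document.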