Signalling in written text: a corpus-based approach

The concern of this paper is the signalling of segments and relations in written texts. It explores the role of visual formatting and its relation to lexical and other markers. Through a corpus-based study of a specific "text object" (definitions in instructional texts), it brings together two models of text structure: RST and the model of text architecture. Unlike RST, this latter model gives a central place to signalling, establishing a theoretically motivated relation of functional equivalence between markers based on typography or layout and lexico-syntactic markers. Definitions in the corpus are characterised on the basis of configurations of markers, and their occurrences charted in the global structure of the text. The distribution of definition patterns highlights the dynamic nature of text: markers of a specific text object vary systematically according to where it occurs in the structural hierarchy of the text. The study establishes a relation between text objects and RST segments, thus opening the range of discourse markers to include visual formatting, and providing RST segments with a textual status.

Introduction

Discourse relations are heterogeneous; text organisation seems to work on several distinct levels (cf. Moore and Pollack 1992). This complexity has been the focus of much research recently, with a number of authors appealing to Halliday's tripartite distinction of linguistic metafunctions (ideational, interpersonal and textual) in order to articulate different perspectives on discourse organisation, or different levels of description (Maier and Hovy 1993, Bateman and Rondhuis 1997). These authors explored ways in which the metafunctions could provide an organising principle for the classification of discourse relations and markers (otherwise classified as semantic vs. pragmatic, subject-matter vs. presentational, etc.).
The textual metafunction, described by Halliday and Hasan (1976) as "the text-forming component in the linguistic system", comprising "the resources that language has for creating text" (ibid: 26), has tended to receive the least developed treatment. The focus of this paper is the textual metafunction, and its aim is to contribute to an understanding of the "resources" that are exploited to create textual meaning, more specifically markers of relations and segment boundaries. My approach belongs in corpus linguistics, and is therefore guided by an awareness of the diversity of language productions. A first factor of variation is domain: a number of studies are concerned with the linguistic characterisation of domain sublanguages (Grishman and Kittredge 1986; Sager, Friedman et al. 1987). A second factor is genre, which subsumes social function, discourse purpose and channel. This study focusses on written texts with a specific discourse function (instructional) within a particular domain: software manuals. The specificity of written texts and its relevance to an understanding of discourse organisation must be stressed. Firstly, in most cases, writing implies that the writer¹ and the intended audience do not share the context of communication. This has two major consequences for the organisation of written text: a) a written text is generally a monologue, where topics are introduced, continued or dropped not through negotiation between discourse participants but on the sole basis of the writer's representations and intentions; b) there is a requirement for explicitness in the signalling of the various levels of meaning. Secondly, a written text is a visual object, and its visual properties are directly involved and exploited by readers in the construction of meaning.
¹ I use the word writer for convenience, even though the production of a text may involve several agents.
The choice of instructional texts derives from a hypothesis linked to the explicitness requirement: the social function of these texts is such that their writers are likely to try to leave as little interpretative leeway as possible. They therefore constitute a good starting point for a study of organisational signals.
Discourse theorists are generally agreed on a recursive structuring involving text segments and discourse relations. Many questions remain open, however, over the signalling of relations and the nature and status of the segments. In RST, the authors stress the absence of specific signalling of rhetorical relations. As for the segments concerned, the minimal units are defined as "typically clauses", but Mann and Thompson specify that the relations in fact hold between the