Generation of Extended Bilingual Statistical Reports

During the past few years we have been concerned with developing models for the automatic planning and realization of report texts within technical sublanguages of English and French. Since 1987 we have been implementing Meaning-Text language models (MTMs) [6, 7] for the task of realizing sentences from semantic specifications that are output by a text planner. A relatively complete MTM implementation for English was tested in the domain of operating system audit summaries in the Gossip project of 1987-89 [3]. At COLING-90 a report was given on the fully operational FoG system for generating marine forecasts in both English and French at weather centres in Eastern Canada [1]. The work reported on here concerns the experimental generation of extended bilingual summaries of Canadian statistical data. Our first focus has been on labour force surveys (LFS), where an extensive corpus of published reports in each language is available for empirical study. The current LFS system has built on the experience of the two preceding systems, but goes beyond either of them¹. In contrast to FoG, but similar to Gossip, LFS uses a semantic net representation of sentences as input to the realization process. Like Gossip, LFS also makes use of theme/rheme constraints to help optimize lexical and syntactic choices during sentence realization. But in contrast to Gossip, which produced only English texts, LFS is bilingual, making use of the conceptual level of representation produced by the planner as an interlingua from which to derive the linguistic semantic representations for texts in the two languages independently. Hence the LFS interlingua is much "deeper" than FoG's deep-syntactic interlingua. This allows us to introduce certain semantic differences between English and French sentences that we observe in natural "translation twin" texts.
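
The role of the conceptual interlingua can be illustrated with a minimal sketch. It is not the LFS implementation: the class names, fields, and lexicon entries below are all hypothetical, chosen only to show how one language-neutral planner output might be mapped independently to an English and a French semantic representation, each carrying its own theme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualFact:
    """Language-neutral planner output (hypothetical shape, not from the paper)."""
    indicator: str   # e.g. "unemployment_rate"
    change: str      # "rise" or "fall"
    amount: float    # size of the change, in percentage points
    period: str      # reference period of the survey


@dataclass
class SemanticNet:
    """Toy stand-in for a language-specific semantic representation."""
    language: str
    predicate: str
    arguments: dict
    theme: str       # what the sentence is about; the rest is the rheme


# Hypothetical lexicon fragments; a real system would use full MTM lexicons.
EN_LEXICON = {"unemployment_rate": "unemployment rate",
              "rise": "increase", "fall": "decrease"}
FR_LEXICON = {"unemployment_rate": "taux de chômage",
              "rise": "augmenter", "fall": "diminuer"}


def derive_semantics(fact: ConceptualFact, language: str, lexicon: dict) -> SemanticNet:
    """Map one conceptual fact to a semantic net for a single language."""
    return SemanticNet(
        language=language,
        predicate=lexicon[fact.change],
        arguments={"entity": lexicon[fact.indicator],
                   "amount": fact.amount,
                   "period": fact.period},
        theme=lexicon[fact.indicator],
    )


def derive_bilingual(fact: ConceptualFact) -> dict:
    """Derive the two semantic nets independently from the same interlingua."""
    return {
        "en": derive_semantics(fact, "en", EN_LEXICON),
        "fr": derive_semantics(fact, "fr", FR_LEXICON),
    }
```

Because each language has its own derivation step, the two nets need not be structurally identical. In this sketch they differ only in lexical choice, but a per-language mapping is the place where semantic divergences of the "translation twin" kind could be introduced.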