Learning Micro-Planning Rules for Preventive Expressions

Building text planning resources by hand is time-consuming and difficult. Certainly, a number of planning architectures and their accompanying plan libraries have been implemented, but while the architectures themselves may be reused in a new domain, the library of plans typically cannot. One way to address this problem is to use machine learning techniques to automate the derivation of planning resources for new domains. In this paper, we apply this technique to build micro-planning rules for preventative expressions in instructional text.

* This work is partially supported by the Engineering and Physical Sciences Research Council (EPSRC) Grant J19221, by BC/DAA9 ARC Project 293, and by the Commission of the European Union Grant LRE-62009.
† After September 1, Dr. Vander Linden's address will be Department of Mathematics and Computer Science, Calvin College, Grand Rapids, MI 49546, USA.

1 Introduction

Building text planning resources by hand is time-consuming and difficult. Certainly, much work has been done in this regard; there are a number of freely available text planning architectures (e.g., Moore and Paris, 1993). It is frequently the case, however, that while the architecture itself can be reused in a new domain, the library of text plans developed for it cannot. In particular, micro-planning rules, those rules that specify the low-level grammatical details of expression, are highly sensitive to variations between sublanguages, and are therefore difficult to reuse.

When faced with a new domain in which to generate text, the typical scenario is to perform a corpus analysis on a representative collection of the text produced by human authors in that domain and to induce a set of micro-planning rules guiding the generation process in accordance with the results. Some fairly simple rules usually jump out of the analysis quickly, mostly based on the analyst's intuitions. For example, in written instructions, user actions are typically expressed as imperatives. Such observations, however, tend to be gross characterisations. More accurate micro-planning requires painstaking analysis. In this paper, for example, the micro-planner must distinguish between phrasings such as "Don't do action-X" and "Take care not to do action-X". Without analysis, it is far from clear how this decision can best be made. Some form of automation would clearly be desirable.

Unfortunately, corpus analysis techniques are not yet capable of automating the initial phases of the corpus study (nor will they be for the foreseeable future). There are, however, techniques for rule induction which are useful for the later stages of corpus analysis and for implementation. In this paper, we focus on the use of such rule induction techniques in the context of the micro-planning of preventative expressions in instructional text. We define what we mean by a preventative expression, and go on to describe a corpus analysis in which we derive three features that predict the grammatical form of such expressions. We then use the C4.5 learning algorithm to construct a micro-planning sub-network appropriate for these expressions. We conclude with an implemented example in which the technical author is allowed to set the relevant features, and the system generates the appropriate expressions in English and in French.
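To make the idea of inducing a micro-planning decision from coded corpus examples more concrete, the following is a minimal sketch of decision-tree induction using the gain-ratio criterion associated with C4.5. It is not the C4.5 program itself (there is no pruning and no handling of continuous attributes), and everything in it is an assumption made for illustration: the feature names `unaware`, `accidental` and `hazard`, their values, and the four coded examples are invented placeholders, not the three features or the data derived in this paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(rows, labels, feature):
    """Information gain of splitting on `feature`, normalised by split info."""
    total = len(rows)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    split_info = entropy([row[feature] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


def build_tree(rows, labels, features):
    """Return a leaf (a label string) or a node (feature, {value: subtree})."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    best = max(features, key=lambda f: gain_ratio(rows, labels, f))
    if gain_ratio(rows, labels, best) <= 0:
        return majority  # no feature separates the remaining examples
    remaining = [f for f in features if f != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in subset]
        sub_labels = [l for _, l in subset]
        branches[value] = build_tree(sub_rows, sub_labels, remaining)
    return (best, branches)


def classify(tree, row, default=None):
    """Walk the tree for one example; return `default` on an unseen value."""
    while isinstance(tree, tuple):
        feature, branches = tree
        if row[feature] not in branches:
            return default
        tree = branches[row[feature]]
    return tree


# Hypothetical coded corpus examples (invented for this sketch only).
DONT = "Don't do action-X"
TAKE_CARE = "Take care not to do action-X"
rows = [
    {"unaware": "yes", "accidental": "yes", "hazard": "high"},
    {"unaware": "yes", "accidental": "yes", "hazard": "low"},
    {"unaware": "yes", "accidental": "no", "hazard": "high"},
    {"unaware": "no", "accidental": "no", "hazard": "low"},
]
labels = [TAKE_CARE, TAKE_CARE, DONT, DONT]

tree = build_tree(rows, labels, ["unaware", "accidental", "hazard"])
print(tree)
print(classify(tree, {"unaware": "no", "accidental": "yes", "hazard": "low"}))
```

The induced tree plays the role of a small rule set: each path from the root to a leaf corresponds to one rule that maps feature settings to a grammatical form. In a real study the examples would come from the hand-coded corpus rather than from a toy table like the one above.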

[1] Cécile Paris, et al. A Support Tool for Writing Multilingual Instructions, 1995, IJCAI.

[2] Brigitte Grote. Matchmaking: dialogue modelling and speech generation meet, 1996, INLG.

[3] Henwood, et al. Author index, 1983, Pharmacology Biochemistry and Behavior.

[4] Amy Isard, et al. Transaction and Action Coding in the Map Task Corpus, 1995.

[5] Daniel Ansari, et al. Deriving Procedural and Warning Instructions from Device and Environment Models, 1995, ArXiv.

[6] Keith Vander Linden, et al. DRAFTER: An Interactive Support Tool for Writing Multilingual Instructions, 1996.

[7] Laurence R. Horn. A Natural History of Negation, 1989.

[8] J. Ross Quinlan, et al. C4.5: Programs for Machine Learning, 1992.

[9] Chris Mellish, et al. Sources of Flexibility in Dynamic Hypertext Generation, 1996, INLG.

[10] Toni Rietveld, et al. Statistical Techniques for the Study of Language and Language Behaviour, 1993.

[11] S. Siegel, et al. Nonparametric Statistics for the Behavioral Sciences, 2022, The SAGE Encyclopedia of Research Design.

[12] Cécile Paris, et al. Expressing Procedural Relationships in Multilingual Instructions, 1994, INLG.

[13] Johanna D. Moore, et al. Planning Text for Advisory Dialogues: Capturing Intentional and Rhetorical Information, 1993, CL.

[14] Jean Carletta, et al. Assessing Agreement on Classification Tasks: The Kappa Statistic, 1996, CL.

[15] Barbara Di Eugenio, et al. Understanding natural language instructions: a computational approach to purpose clauses, 1993.