A flexible, parallel model of natural language generation

This thesis describes a structured connectionist system for natural language generation. 'FIG', short for 'Flexible Incremental Generator', is based on a single network which encodes lexical knowledge, syntactic knowledge, and world knowledge. In the initial state, some nodes representing concepts are sources of activation; this represents the input. Activation flows from these nodes, via the various knowledge structures of the network, to nodes representing words. When the network settles, the most highly activated word is selected and emitted, and activation levels are updated to represent the new current state. This process of settle, emit, and update repeats until all of the input has been conveyed. An utterance is simply the result of successive word choices.

Syntactic knowledge is encoded with network structures representing constructions and their constituents. Constituents are linked to words, syntactic categories, relations, and other constructions. Activation flow via these links, and eventually to words, provides for constituency and subcategorization. The links to constituents are gated by 'cursors', which are updated over time based on feedback from the words output so far. This mechanism ensures that syntactically appropriate words and concepts become highly activated at the right time, thus causing words to appear in the right order.

FIG suggests that the complexity present in most treatments of syntax is unnecessary. For example, it does not assemble or manipulate syntactic structures; constructions affect the utterance only by the activation they transmit, directly or indirectly, to words. Unlike previous generators, FIG handles arbitrarily large inputs, since the number of nodes activated in the initial state makes no difference to its operation; it handles trade-offs among competing goals without additional mechanism, since all computation is in terms of commensurate levels of activation; and it handles interaction among choices easily, since links among nodes representing such choices lead it to settle into a state representing a compatible set of choices.

Abstracting from the implementation leads to several design principles for generation, including explicit representation of the current state and pervasive use of parallelism. The design is a weak cognitive model and also points the way to more natural machine translation.
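To make the settle/emit/update cycle concrete, here is a minimal Python sketch. It is not taken from the thesis: the node names, link weights, decay factor, and fixed number of spreading rounds are all illustrative assumptions, and it omits FIG's constructions and cursor gating, which would further modulate link strengths over time to enforce word order.

    from collections import defaultdict

    # Directed, weighted links from concept nodes (c:) to word nodes (w:).
    # A real FIG network would also contain syntactic and world knowledge.
    LINKS = {
        "c:dog": [("w:dog", 0.9), ("w:hound", 0.4)],
        "c:run": [("w:ran", 0.8)],
    }
    # Which input concept each word conveys; used in the update step.
    CONVEYS = {"w:dog": "c:dog", "w:hound": "c:dog", "w:ran": "c:run"}

    def settle(sources, rounds=10, decay=0.5):
        """Spread activation from the source concepts until levels stabilize
        (here approximated by a fixed number of rounds)."""
        act = defaultdict(float, sources)
        for _ in range(rounds):
            nxt = defaultdict(float)
            for node, level in act.items():
                nxt[node] += level * decay          # residual self-activation
                for tgt, w in LINKS.get(node, []):
                    nxt[tgt] += level * w           # flow along each link
            for n, level in sources.items():        # inputs stay active
                nxt[n] = max(nxt[n], level)         # until conveyed
            act = nxt
        return act

    def generate(input_concepts):
        """Repeat settle / emit / update until all input is conveyed."""
        sources = {c: 1.0 for c in input_concepts}
        utterance = []
        while sources:
            act = settle(sources)
            word_acts = {n: a for n, a in act.items() if n in CONVEYS}
            best = max(word_acts, key=word_acts.get)   # emit top word
            utterance.append(best)
            sources.pop(CONVEYS[best], None)           # concept now conveyed
        return utterance

    print(generate(["c:dog", "c:run"]))   # -> ['w:dog', 'w:ran']

Each pass settles the network, emits the most highly activated word, and removes the conveyed concept from the sources, mirroring the loop described above; note that the utterance is nothing more than the sequence of successive word choices, with no syntactic structure ever assembled.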