Simulation-based understanding of texts about equipment

This thesis presents a natural language understanding system operating in the domain of equipment consisting of mechanical, hydraulic, and electrical elements. The task of the system is to analyze reports regarding the failure, diagnosis, and repair of equipment. We argue that general knowledge of equipment is not sufficient for a full understanding of such reports. As an alternative, we propose a system that relies on a detailed simulation model to support language understanding. We describe the structure of the model and emphasize the features specifically required for language understanding. We show how this model can be used to analyze complex noun phrases describing equipment parts and to determine their referents. We outline the data structures used for concepts that are mentioned in the text but have no permanent representation in the model, and explain how they are created during text analysis. Similarly, we discuss the data structures for representing the facts conveyed by the text, and provide algorithms for translating text that expresses facts into these representations. We point out the importance of identifying the implicit temporal and causal relations in the text and show how the simulation capabilities of the model support this task. We present a dynamic graphical interface that gives the user insight into the way the system has understood the input. Finally, we indicate how our system may be extended to support dynamic extensions to its database (i.e., extensions made during the analysis of text) and to assist the user in entering new equipment models. Most aspects of the system discussed here were implemented on a Symbolics Lisp machine.