The organization of knowledge in a multi-lingual, integrated parser (natural language, translation)

A controversy has existed over the interaction of syntax and semantics in natural language understanding systems. On the one hand, theories of integrated parsing have argued that syntactic and semantic processing must take place at the same time. These theories have also argued that syntactic and semantic knowledge should be mixed together, and that the role of syntax should be completely subservient to semantic processing. On the other hand, opponents of these theories argue that parsing should be more modular, with syntactic and semantic processing taking place separately. Along with this processing modularity, they argue that syntactic and semantic knowledge should also be kept more modular, and that syntax, since it is largely autonomous from semantics, plays a more important role in natural language understanding. This thesis presents a theory of natural language understanding that is a compromise between these two views. I argue that natural language understanding should be integrated, in the sense that syntactic and semantic processing should take place at the same time. However, instead of mixing syntactic and semantic knowledge together in the knowledge base of a parser, I argue that power can be gained by organizing syntax and semantics as two largely separate bodies of knowledge, which are combined only at the time of processing. The result is a parser that retains the predictive power gained by using semantic information during syntactic processing, but that is more robust in parsing complex syntactic constructions and more amenable to organizing knowledge about more than one language.
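
The organization argued for above can be pictured with a small toy sketch. It is an illustration only and not the parser developed in the thesis: every name in it (`SEMANTICS`, `CONCEPT_TYPES`, `SYNTAX`, `ROLE_OF`, `parse`), the frame `INGEST`, the two tiny lexicons, and the fixed three-word clause pattern are assumptions made for this example. Semantic knowledge is stored once and shared, syntactic knowledge is stored separately per language, and the two are consulted together only while a sentence is being read, word by word.

```python
from dataclasses import dataclass

# Semantic knowledge: language-independent frames and concept types.
# (Hypothetical toy content, assumed for illustration.)
SEMANTICS = {"INGEST": {"actor": "ANIMATE", "object": "FOOD"}}
CONCEPT_TYPES = {"JOHN": "ANIMATE", "APPLE": "FOOD"}


@dataclass(frozen=True)
class Syntax:
    """Syntactic knowledge for one language, stored apart from SEMANTICS."""
    order: tuple    # grammatical function expected at each clause position
    lexicon: dict   # word -> (syntactic category, concept name)


SYNTAX = {
    "english": Syntax(
        order=("subject", "verb", "object"),
        lexicon={
            "john": ("noun", "JOHN"),
            "eats": ("verb", "INGEST"),
            "apples": ("noun", "APPLE"),
        },
    ),
    # Romanized Japanese with particles omitted, to keep the toy small.
    "japanese": Syntax(
        order=("subject", "object", "verb"),
        lexicon={
            "john": ("noun", "JOHN"),
            "ringo": ("noun", "APPLE"),
            "taberu": ("verb", "INGEST"),
        },
    ),
}

# The only link between the two bodies of knowledge, used at parse time.
ROLE_OF = {"subject": "actor", "object": "object"}


def parse(sentence, language):
    """Read a three-word clause left to right, consulting both knowledge sources per word."""
    syntax = SYNTAX[language]
    frame, roles, pending = None, {}, []
    for position, word in enumerate(sentence.lower().split()):
        # Syntactic step: which grammatical function does this position carry?
        function = syntax.order[position]
        category, concept = syntax.lexicon[word]
        if (function == "verb") != (category == "verb"):
            raise ValueError(f"{word!r} cannot serve as {function}")
        if function == "verb":
            frame = concept
        else:
            pending.append((ROLE_OF[function], concept))
        # Semantic step, run on every word: once a frame is known,
        # check each waiting filler against the frame's constraints.
        if frame is not None:
            for role, filler in pending:
                expected = SEMANTICS[frame][role]
                if CONCEPT_TYPES[filler] != expected:
                    raise ValueError(f"{filler} is not {expected} for role {role}")
                roles[role] = filler
            pending = []
    return {"frame": frame, **roles}
```

Under these assumptions, `parse("John eats apples", "english")` and `parse("John ringo taberu", "japanese")` build the same `INGEST` frame. Supporting another language in this toy means adding one more `Syntax` entry while `SEMANTICS` stays untouched, which is the kind of separation the abstract describes. The sketch handles only three-word clauses whose words appear in the lexicon.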