Trade-offs Between Syntactic and Semantic Processing in the Comprehension of Real Texts

Work on the machine understanding of real texts, such as newspaper stories, has tended to proceed either entirely by semantic means (conceptual analyzers, word-expert parsers) or by a staged approach that first applies syntactic processing (part-of-speech analysis plus traditional or statistical parsing) and then phrase-by-phrase semantic analysis. The first technique faces difficult control-structure problems and can miss the obvious; the second faces nearly insurmountable structural ambiguity during its parsing phase and must often fall back on statistical methods and good guesses. We present a technique for natural language understanding anchored in a lexically driven syntactic analysis based on semantic grammars. It solves the control problem by constructing its primary grammar through the application of syntactically principled schemas using Tree Adjoining Grammar. Additional default rules of syntactic combination are also allowed when the validity of the combination is undeniable and the semantic interpretation of the new phrase is always the same, as with auxiliary verbs, relative pronouns, and the like. We discuss recent developments in our work, in which a richer model of semantic composition lets more of the work be done by the syntactic default rules, without our having to anticipate the semantic interpretation of a particular pair of constituents, while still yielding a specific, object-oriented model of the information in the understood text.
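The notion of a default rule whose semantic interpretation is always the same can be sketched as follows. This is a minimal illustrative toy, not the authors' system: the names `Phrase` and `combine_aux` and the feature representation are assumptions made for exposition. The point is that an auxiliary verb can attach to a verb phrase purely on syntactic grounds, with the phrase's semantics passing through essentially unchanged, so the rule fires without consulting the domain model.

```python
from dataclasses import dataclass, field

@dataclass
class Phrase:
    category: str      # syntactic category, e.g. "VP"
    words: list        # surface words covered by the phrase
    semantics: dict = field(default_factory=dict)  # interpretation built so far

def combine_aux(aux_word: str, vp: Phrase) -> Phrase:
    """Default rule: auxiliary + VP -> VP.
    The semantics of the result is the VP's semantics plus a fixed
    tense/aspect feature, the same for every phrase the rule touches,
    so no phrase-specific interpretation must be anticipated."""
    if vp.category != "VP":
        raise ValueError("default rule applies only to verb phrases")
    new_sem = dict(vp.semantics)
    new_sem["aspect"] = {"will": "future", "has": "perfect"}.get(aux_word, "unknown")
    return Phrase("VP", [aux_word] + vp.words, new_sem)

vp = Phrase("VP", ["acquired", "the", "company"], {"event": "acquire"})
result = combine_aux("has", vp)
print(result.words)      # ['has', 'acquired', 'the', 'company']
print(result.semantics)  # {'event': 'acquire', 'aspect': 'perfect'}
```

Relative pronouns and similar closed-class items would admit analogous rules, each with a single, invariant semantic effect.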