Toward Text Understanding: Integrating Relevance-tagged Corpus and Automatically Constructed Case Frames

This paper proposes a wide-range anaphora resolution system as a step toward text understanding. The system resolves zero, direct, and indirect anaphors in Japanese texts by integrating two kinds of linguistic resources: a hand-annotated corpus tagged with various relations, and automatically constructed case frames. The corpus carries relevance tags that cover predicate-argument relations, relations between nouns, and coreference relations; it is used both to learn the parameters of the system and to test it. The case frames provide knowledge that is indispensable both for detecting zero and indirect anaphors and for estimating appropriate antecedents. Our preliminary experiments showed promising results.