Knowledge-based Coreference Resolution for Hungarian

We present a knowledge-based coreference resolution system for noun phrases in Hungarian texts. The system is used as a module in an automated psychological text processing project. Our system uses rules that rely on knowledge from the morphological, syntactic and semantic output of a deep parser and semantic relations form the Hungarian WordNet ontology. We also use rules that rely on Binding Theory, research results in Hungarian psycholinguistics, current research on proper name coreference identification and our own heuristics. We describe the constraints-and-preferences algorithm in detail that attempts to find coreference information for proper names, common nouns, pronouns and zero pronouns in texts. We present evaluation results for our system on a corpus manually annotated with coreference relations. Precision of the resolution of various coreference types reaches up to 80%, while overall recall is 63%. We also present an investigation of the various error types our system produced, along with an analysis of the results.