A data-driven approach to pronominal anaphora resolution for German

This paper reports on a hybrid architecture for computational anaphora resolution (CAR) of German that combines a rule-based pre-filtering component with a memory-based resolution module built on the Tilburg Memory-Based Learner (TiMBL). The data come from the TüBa-D/Z treebank of German newspaper text (Telljohann et al. 04), which is annotated with anaphoric relations. The CAR experiments performed on these treebank data corroborate the importance of modelling aspects of discourse structure for robust, data-driven anaphora resolution. The best result, an F-measure of 0.734, outperforms the results reported by Schiehlen (04), the only other study of German CAR that is based on newspaper treebank data.
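The two-stage idea described above (rule-based pre-filtering followed by memory-based classification) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature names (`gender`, `number`, `gram_func`, `sent`), the agreement filter, and the labels are hypothetical, and a plain overlap-distance nearest-neighbour vote stands in for TiMBL rather than calling it. It is not the feature set or configuration used in the paper.

```python
from collections import Counter

def agrees(pronoun, candidate):
    """Rule-based pre-filter (assumed rule): gender and number must match."""
    return (pronoun["gender"] == candidate["gender"]
            and pronoun["number"] == candidate["number"])

def pair_features(pronoun, candidate):
    """Symbolic features for one pronoun-candidate pair (hypothetical set)."""
    return (
        candidate["gram_func"],
        pronoun["gram_func"],
        min(pronoun["sent"] - candidate["sent"], 3),  # sentence distance, capped at 3
    )

def overlap_distance(a, b):
    """Number of feature positions on which two instances differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_label(memory, features, k=3):
    """Majority label among the k stored instances closest to `features`."""
    nearest = sorted(memory, key=lambda ex: overlap_distance(ex[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def resolve(pronoun, candidates, memory, k=3):
    """Filter candidates by rule, then keep those classified as anaphoric."""
    filtered = [c for c in candidates if agrees(pronoun, c)]
    return [c for c in filtered
            if knn_label(memory, pair_features(pronoun, c), k) == "anaphoric"]
```

Here `memory` is a list of `(features, label)` pairs extracted from annotated training data. The pre-filter removes candidates that can never be antecedents, so the learner only has to rank plausible pairs.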

[1] Claire Gardent, et al. Improving Machine Learning Approaches to Coreference Resolution, 2002, ACL.

[2] Branimir Boguraev, et al. Anaphora for Everyone: Pronominal Anaphora Resolution without a Parser, 1996, COLING.

[3] Wendy G. Lehnert, et al. Using Decision Trees for Coreference Resolution, 1995, IJCAI.

[4] Udo Hahn, et al. Functional Centering - Grounding Referential Coherence in Information Structure, 1999, Comput. Linguistics.

[5] Walter Daelemans, et al. TiMBL: Tilburg Memory-Based Learner, version 2.0, Reference guide, 1998.

[6] Walter Daelemans, et al. Parameter optimization for machine-learning of word sense disambiguation, 2002, Natural Language Engineering.

[7] Andrew Kehler, et al. Probabilistic Coreference in Information Extraction, 1997, EMNLP.

[8] Sandra Kübler, et al. Recent Developments in Linguistic Annotations of the TüBa-D/Z Treebank, 1999.

[9] James F. Allen, et al. Empirical evaluations of pronoun resolution, 2005.

[10] John Hale, et al. A Statistical Approach to Anaphora Resolution, 1998, VLC@COLING/ACL.

[11] Hwee Tou Ng, et al. A Machine Learning Approach to Coreference Resolution of Noun Phrases, 2001, CL.

[12] Wolfgang Menzel, et al. A broad-coverage parser for German based on defeasible constraints, 2008.

[13] Michael Strube, et al. A Machine Learning Approach to Pronoun Resolution in Spoken Dialogue, 2003, ACL.

[14] Erhard W. Hinrichs, et al. The TüBa-D/Z Treebank: Annotating German with a Context-Free Backbone, 2004, LREC.

[15] Julia Trushkina. Morpho-syntactic annotation and dependency parsing of German, 2004.

[16] Jean-Pierre Chanod, et al. Robustness beyond shallowness: incremental deep parsing, 2002, Natural Language Engineering.

[17] Michael Schiehlen, et al. Optimizing Algorithms for Pronoun Resolution, 2004, COLING.

[18] Shalom Lappin, et al. An Algorithm for Pronominal Anaphora Resolution, 1994, CL.

[19] Beata Kouchnir, et al. A Machine Learning Approach to German Pronoun Resolution, 2004, ACL.

[20] Frank Henrik Müller, et al. A finite-state approach to shallow parsing and grammatical functions annotation of German, 2005.

[21] Michael Strube, et al. Applying Co-Training to Reference Resolution, 2002, ACL.