Discourse-New Detectors for Definite Description Resolution: A Survey and a Preliminary Proposal

Vieira and Poesio (2000) proposed an algorithm for definite description (DD) resolution that incorporates a number of heuristics for detecting discourse-new descriptions. The inclusion of such detectors was motivated by the observation that more than 50% of definite descriptions (DDs) in an average corpus are discourse-new (Poesio and Vieira, 1998). However, whereas the inclusion of detectors for non-anaphoric pronouns in algorithms such as that of Lappin and Leass (1994) leads to clear improvements in precision, the improvements in anaphoric DD resolution (as opposed to classification) brought about by the detectors were rather small. In fact, Ng and Cardie (2002a) challenged the motivation for the inclusion of such detectors, reporting no improvement, or even worse performance. We re-examine the literature on the topic in detail and propose a revised algorithm that takes advantage of the improved discourse-new detection techniques developed by Uryupina (2003).

[1] Laurence R. Horn, et al. Definiteness and Indefiniteness, 2001.

[2] Shalom Lappin, et al. An Algorithm for Pronominal Anaphora Resolution, 1994, CL.

[3] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[4] 金田 重郎, et al. C4.5: Programs for Machine Learning (Book Review), 1995.

[5] Claire Gardent, et al. Improving Machine Learning Approaches to Coreference Resolution, 2002, ACL.

[6] Ruslan Mitkov, et al. Robust Pronoun Resolution with Limited Knowledge, 1998, ACL.

[7] Barbara Di Eugenio, et al. Centering: A Parametric Theory and Its Instantiations, 2004, Computational Linguistics.

[8] Claire Cardie, et al. Identifying Anaphoric and Non-Anaphoric Noun Phrases to Improve Coreference Resolution, 2002, COLING.

[9] J. Ross Quinlan, et al. C4.5: Programs for Machine Learning, 1992.

[10] Joel Tetreault, et al. A Corpus-Based Evaluation of Centering and Pronoun Resolution, 2001, Computational Linguistics.

[11] William W. Cohen. Fast Effective Rule Induction, 1995, ICML.

[12] Renata Vieira, et al. An Empirically-based System for Processing Definite Descriptions, 2000, CL.

[13] Renata Vieira, et al. A Corpus-based Investigation of Definite Description Use, 1997, CL.

[14] Ellen Riloff, et al. Corpus-Based Identification of Non-Anaphoric Noun Phrases, 1999, ACL.

[15] Ruslan Mitkov. Towards More Comprehensive Evaluation in Anaphora Resolution, 2000, LREC.

[16] E. Prince. The ZPG Letter: Subjects, Definiteness, and Information-status, 1992.

[17] J. Ross Quinlan, et al. Induction of Decision Trees, 1986, Machine Learning.

[18] Massimo Poesio, et al. Annotating a Corpus to Develop and Evaluate Discourse Entity Realization Algorithms: Issues and Preliminary Results, 2000, LREC.

[19] Olga Uryupina, et al. High-precision Identification of Discourse New and Unique Noun Phrases, 2003, ACL.