Evaluation-driven design of a robust coreference resolution system

In this paper, we describe a system for coreference resolution and emphasize the role of evaluation in its design. The goal of the system is to group referring expressions, identified beforehand in narrative texts, into sets of coreferring expressions that correspond to discourse entities. Several knowledge sources are distinguished, such as referential compatibility between a referring expression and a discourse entity, activation factors for discourse entities, the size of working memory, and meta-rules for the creation of discourse entities. For each knowledge source, the theoretical analysis of its relevance is compared to the scores obtained through evaluation. Once all knowledge sources have been examined in this way, the best-scoring behavior is selected and then evaluated on test data. The paper also discusses evaluation measures and data annotation, and compares the present approach to others in the field.
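
The grouping task described above can be illustrated with a minimal Python sketch. It is not the system from the paper: the agreement features (`gender`, `number`), the activation formula, and every name in it (`Mention`, `Entity`, `compatible`, `resolve`, `memory_size`) are assumptions chosen for illustration. It only shows how referential compatibility, activation, and a bounded working memory could interact when each referring expression is either attached to an existing discourse entity or used to create a new one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Mention:
    """A referring expression, identified beforehand (hypothetical features)."""
    text: str
    position: int
    gender: Optional[str] = None
    number: Optional[str] = None


@dataclass
class Entity:
    """A discourse entity: the set of mentions grouped so far."""
    mentions: List[Mention] = field(default_factory=list)

    def activation(self, now: int) -> float:
        # Assumed heuristic: entities mentioned often and recently are more active.
        last = self.mentions[-1].position
        return len(self.mentions) / (1 + now - last)


def compatible(mention: Mention, entity: Entity) -> bool:
    """Assumed compatibility test: no clash on any feature known on both sides."""
    for other in entity.mentions:
        for feat in ("gender", "number"):
            a, b = getattr(mention, feat), getattr(other, feat)
            if a is not None and b is not None and a != b:
                return False
    return True


def resolve(mentions: List[Mention], memory_size: int = 5) -> List[Entity]:
    """Attach each mention to the most active compatible entity, else create one."""
    entities: List[Entity] = []
    for mention in mentions:
        # Working memory: only the most active entities are considered.
        active = sorted(
            entities,
            key=lambda e: e.activation(mention.position),
            reverse=True,
        )[:memory_size]
        target = next((e for e in active if compatible(mention, e)), None)
        if target is None:
            # Assumed creation rule: an unattached mention starts a new entity.
            target = Entity()
            entities.append(target)
        target.mentions.append(mention)
    return entities


if __name__ == "__main__":
    text = [
        Mention("Mary", 0, "fem", "sg"),
        Mention("John", 1, "masc", "sg"),
        Mention("she", 2, "fem", "sg"),
        Mention("he", 3, "masc", "sg"),
        Mention("they", 4, None, "pl"),
    ]
    for entity in resolve(text):
        print([m.text for m in entity.mentions])
```

In this sketch, varying `memory_size` or swapping the `compatible` and `activation` definitions changes how mentions are grouped, which is the kind of setting that evaluation scores could be used to compare.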
