New learning models for robust reference resolution

An important challenge for the automatic understanding of natural language texts is the correct computation of the discourse entities that are mentioned therein: persons, locations, abstract objects, and so on. The problem of mapping linguistic expressions onto these underlying entities is known as reference resolution. Recent years of research in computational reference resolution have seen the emergence of machine learning approaches, which are much more robust and better performing than their rule-based predecessors. Unfortunately, perfect performance is still out of reach for these systems. Broadly defined, the aim of this dissertation is to improve on these existing systems by exploring more advanced machine learning models, which (i) encode the structure of the problem more adequately, and (ii) make better use of the information sources that are given to the system. Starting with the sub-task of anaphora resolution, we propose to model this task as a ranking problem rather than as a classification problem (as is done in existing systems). A ranker offers a potentially better way to model this task by directly including the comparison between antecedent candidates as part of its training criterion. We find that the ranker delivers significant performance improvements over classification-based systems, and is also computationally more attractive in terms of training time and learning rate than its rivals. The ranking approach is then extended to the larger problem of coreference resolution. The main goal is to see whether the better antecedent selection capabilities offered by the ranking approach can also benefit the larger coreference resolution task. The extension is two-fold. First, we design various specialized ranker models for different types of referential expressions (e.g., pronouns, definite descriptions, proper names). Besides its linguistic appeal, this division of labor also has the potential to yield better model parameters.
Second, we augment these rankers with a model that determines the discourse status of mentions and that is used to filter out the “non-anaphoric” mentions. As shown by various experiments, this combined strategy results in significant performance improvements over the single-model, classification-based approach on the three main coreference metrics: the standard MUC metric, but also the more representative B³ and CEAF metrics. Finally, we show how the task of coreference resolution can be recast as a linear optimization problem. In particular, we use the framework of Integer Linear Programming (ILP) to: (i) combine the predictions of three local models (namely, a standard pairwise coreference classifier, a discourse status classifier, and a named entity classifier) in a joint, global inference, and (ii) integrate various other global constraints (such as transitivity constraints) to better capture the dependencies between coreference decisions. Tested on the ACE datasets, our ILP formulations deliver significant F-score improvements over both a standard pairwise model and various models that employ the discourse status and named entity classifiers in a cascade. These improvements were again found to hold across the three different evaluation metrics: MUC, B³, and CEAF. The fact that B³ and CEAF scores were also improved is of particular importance, since these two metrics are much less lenient than MUC in terms of precision errors.
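
To make the role of the transitivity constraints concrete, the following is a minimal sketch, not the dissertation's actual formulation. It assumes that each mention pair (i, j) gets a binary link variable, that the objective sums the log-probabilities of a pairwise classifier, and that transitivity is enforced as x_ij + x_jk - x_ik <= 1 for every triple. The discourse status and named entity variables of the joint model are left out. The function name `solve_coreference_ilp` and the toy probabilities are hypothetical, and exhaustive enumeration stands in for a real ILP solver, so this only works for a handful of mentions.

```python
import itertools
import math


def solve_coreference_ilp(num_mentions, pair_prob):
    """Exhaustively solve a tiny 0/1 program over pairwise coreference links.

    pair_prob maps each pair (i, j) with i < j to an assumed classifier
    probability strictly between 0 and 1 that the two mentions corefer.
    """
    mentions = range(num_mentions)
    pairs = list(itertools.combinations(mentions, 2))
    best_score, best_links = -math.inf, None

    for values in itertools.product((0, 1), repeat=len(pairs)):
        x = dict(zip(pairs, values))

        def link(a, b):
            # Link variables are symmetric; look them up in sorted order.
            return x[(min(a, b), max(a, b))]

        # Transitivity: if i~j and j~k are both linked, i~k must be linked too.
        transitive = all(
            link(i, j) + link(j, k) - link(i, k) <= 1
            for i, j, k in itertools.permutations(mentions, 3)
        )
        if not transitive:
            continue

        # Objective: log-likelihood of the link assignment under the classifier.
        score = sum(
            v * math.log(pair_prob[p]) + (1 - v) * math.log(1 - pair_prob[p])
            for p, v in x.items()
        )
        if score > best_score:
            best_score, best_links = score, x

    return best_links, best_score


# Toy input: the classifier links 0~1 and 1~2 strongly but is unsure about 0~2.
probs = {(0, 1): 0.9, (1, 2): 0.9, (0, 2): 0.4}

# Independent thresholding at 0.5 yields an inconsistent set of links.
independent = {pair: int(p > 0.5) for pair, p in probs.items()}

# Global inference returns the best assignment that respects transitivity.
links, score = solve_coreference_ilp(3, probs)
```

On this toy input, thresholding each pair independently links mentions 0 and 1 and mentions 1 and 2 but not 0 and 2, which is not a valid partition into entities. The constrained optimum places all three mentions in one entity, because dropping either strong link costs more than accepting the weak one.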
