Improving Coreference Resolution by Learning Entity-Level Distributed Representations

A long-standing challenge in coreference resolution has been the incorporation of entity-level information: features defined over clusters of mentions instead of mention pairs. We present a neural-network-based coreference system that produces high-dimensional vector representations for pairs of coreference clusters. Using these representations, our system learns when combining clusters is desirable. We train the system with a learning-to-search algorithm that teaches it which local decisions (cluster merges) will lead to a high-scoring final coreference partition. The system substantially outperforms the current state of the art on the English and Chinese portions of the CoNLL 2012 Shared Task dataset despite using few hand-engineered features.
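
To make the idea of scoring cluster pairs and merging greedily concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the system described above: the element-wise product in `pair_features`, the max/mean pooling in `cluster_pair_representation`, the stopping threshold of zero, and all function names are hypothetical choices. The `score` callable stands in for a learned network, and the learning-to-search training is omitted entirely.

```python
from itertools import combinations, product
from typing import Callable, List, Sequence

Vector = Sequence[float]
Cluster = List[int]  # indices of the mentions that belong to one cluster


def pair_features(a: Vector, b: Vector) -> List[float]:
    """Hypothetical mention-pair representation: element-wise product."""
    return [x * y for x, y in zip(a, b)]


def cluster_pair_representation(
    c1: Cluster, c2: Cluster, mentions: Sequence[Vector]
) -> List[float]:
    """Pool all cross-cluster mention-pair vectors into one fixed-size vector.

    Assumption: max pooling concatenated with mean pooling.
    """
    pairs = [pair_features(mentions[i], mentions[j]) for i, j in product(c1, c2)]
    dims = range(len(pairs[0]))
    max_pool = [max(p[d] for p in pairs) for d in dims]
    avg_pool = [sum(p[d] for p in pairs) / len(pairs) for d in dims]
    return max_pool + avg_pool


def greedy_merge(
    mentions: Sequence[Vector], score: Callable[[List[float]], float]
) -> List[Cluster]:
    """Start from singleton clusters and repeatedly merge the best-scoring pair.

    Stops when no pair scores above zero (an assumed threshold).
    """
    clusters: List[Cluster] = [[i] for i in range(len(mentions))]
    while len(clusters) > 1:
        best_score, (a, b) = max(
            (score(cluster_pair_representation(clusters[a], clusters[b], mentions)), (a, b))
            for a, b in combinations(range(len(clusters)), 2)
        )
        if best_score <= 0:
            break
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


if __name__ == "__main__":
    # Toy 2-d "mention embeddings": two groups pointing in different directions.
    toy_mentions = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    # Stand-in scorer: sum of pooled features minus a bias.
    print(greedy_merge(toy_mentions, lambda rep: sum(rep) - 1.0))
```

Because the representation is computed over whole clusters, a merge decision can take into account every mention already assigned to either side, which is the kind of entity-level information a purely pairwise model cannot see.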
