Learning Global Features for Coreference Resolution

There is compelling evidence that coreference prediction would benefit from modeling global information about entity clusters. Yet state-of-the-art performance can be achieved by systems that treat each mention prediction independently, which we attribute to the inherent difficulty of crafting informative cluster-level features. We instead propose to use recurrent neural networks (RNNs) to learn latent, global representations of entity clusters directly from their mentions. We show that such representations are especially useful for the prediction of pronominal mentions, and that they can be incorporated into an end-to-end coreference system that outperforms the state of the art without requiring any additional search.
