Learning Anaphoricity and Antecedent Ranking Features for Coreference Resolution

We introduce a simple, non-linear mention-ranking model for coreference resolution that attempts to learn distinct feature representations for anaphoricity detection and antecedent ranking, which we encourage by pre-training on a pair of corresponding subtasks. Although we use only simple, unconjoined features, the model is able to learn useful representations, and we report the best overall score on the CoNLL 2012 English test set to date.
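As a rough illustration of the kind of model the abstract describes, the sketch below scores each candidate antecedent of a mention with two separate non-linear feature representations: one built from anaphoricity features of the mention alone, and one built from pairwise mention-antecedent features. Everything specific here is an assumption made for illustration and is not taken from the abstract: the class name `MentionRanker`, the single `tanh` hidden layer, the parameter names (`W_a`, `W_p`, `u`, `v`), and the rule that a mention starts a new entity unless some antecedent outscores the "no antecedent" option. Training and the subtask pre-training are not shown.

```python
import math
import random


def affine_tanh(W, b, x):
    """Return tanh(W x + b), with W given as a list of rows."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def random_matrix(rows, cols, rng):
    """Small random weights; the initialisation scheme is an assumption."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class MentionRanker:
    """Hypothetical mention-ranking scorer with two learned representations."""

    def __init__(self, d_ana, d_pair, hidden=4, seed=0):
        rng = random.Random(seed)
        # Anaphoricity representation: depends on the mention only.
        self.W_a = random_matrix(hidden, d_ana, rng)
        self.b_a = [0.0] * hidden
        # Antecedent-ranking representation: depends on the mention pair.
        self.W_p = random_matrix(hidden, d_pair, rng)
        self.b_p = [0.0] * hidden
        # Output weights: u scores a link, v scores "no antecedent".
        self.u = [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden)]
        self.v = [rng.uniform(-0.1, 0.1) for _ in range(hidden)]

    def score_new(self, ana_feats):
        """Score for treating the mention as non-anaphoric."""
        h_a = affine_tanh(self.W_a, self.b_a, ana_feats)
        return dot(self.v, h_a)

    def score_link(self, ana_feats, pair_feats):
        """Score for linking the mention to one candidate antecedent."""
        h_a = affine_tanh(self.W_a, self.b_a, ana_feats)
        h_p = affine_tanh(self.W_p, self.b_p, pair_feats)
        return dot(self.u, h_a + h_p)  # list concatenation of the two vectors

    def predict(self, ana_feats, pair_feats_by_candidate):
        """Return the index of the best antecedent, or None if none wins."""
        best, best_score = None, self.score_new(ana_feats)
        for idx, pair_feats in enumerate(pair_feats_by_candidate):
            s = self.score_link(ana_feats, pair_feats)
            if s > best_score:
                best, best_score = idx, s
        return best


# Usage: one mention with three candidate antecedents (toy feature vectors).
ranker = MentionRanker(d_ana=2, d_pair=2)
choice = ranker.predict([1.0, 0.0], [[0.0, 1.0], [1.0, 1.0], [0.5, 0.0]])
```

Keeping the two hidden layers separate is what lets each one be initialised from a different subtask before the full ranking model is trained, which is the design idea the abstract points to.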
