Learning Structured Perceptrons for Coreference Resolution with Latent Antecedents and Non-local Features

We investigate different ways of learning structured perceptron models for coreference resolution when using non-local features and beam search. Our experimental results indicate that standard techniques such as early updates or Learning as Search Optimization (LaSO) perform worse than a greedy baseline that only uses local features. By modifying LaSO to delay updates until the end of each instance we obtain significant improvements over the baseline. Our model obtains the best results to date on recent shared task data for Arabic, Chinese, and English.
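
The delayed-update idea can be illustrated with a minimal sketch. It is not the authors' implementation, only one plausible reading of "delay updates until the end of each instance". All names (`train_on_instance`, `candidates`, `features`) are hypothetical. The sketch also assumes a single gold decision sequence per instance, which is a simplification, since the model described above treats antecedents as latent. When the gold prefix falls off the beam, the perceptron update is recorded rather than applied, the beam is reset to the gold prefix, and search continues; the accumulated update is applied only once the instance has been processed.

```python
def score(weights, feats):
    """Linear model score: sum of the weights of the active features."""
    return sum(weights.get(f, 0.0) for f in feats)


def train_on_instance(weights, gold, candidates, features, beam_size=2):
    """Process one instance with beam search and delayed LaSO-style updates.

    Assumptions (hypothetical interface, not taken from the paper):
      weights    -- dict mapping feature name to weight, modified in place
      gold       -- the single gold decision sequence for this instance
      candidates -- candidates(t) returns the possible decisions at step t
      features   -- features(prefix) returns the features of a partial
                    sequence; it may inspect the whole prefix (non-local)
    Returns the number of violations found on this instance.
    """
    gold = tuple(gold)
    pending = {}      # accumulated update, applied only at the end
    violations = 0
    beam = [()]       # each hypothesis is a tuple of decisions so far

    def record(good, bad):
        # Standard perceptron difference: +features(good) - features(bad).
        for f in features(good):
            pending[f] = pending.get(f, 0.0) + 1.0
        for f in features(bad):
            pending[f] = pending.get(f, 0.0) - 1.0

    for t in range(len(gold)):
        expanded = [hyp + (c,) for hyp in beam for c in candidates(t)]
        expanded.sort(key=lambda h: score(weights, features(h)), reverse=True)
        beam = expanded[:beam_size]
        gold_prefix = gold[: t + 1]
        if gold_prefix not in beam:
            # Gold fell off the beam: record the update but do not apply it,
            # then restart the search from the gold prefix.
            violations += 1
            record(gold_prefix, beam[0])
            beam = [gold_prefix]

    if beam[0] != gold:
        # Gold survived in the beam but is not ranked first.
        violations += 1
        record(gold, beam[0])

    # Delayed step: apply everything collected on this instance at once.
    for f, delta in pending.items():
        weights[f] = weights.get(f, 0.0) + delta
    return violations
```

Plain LaSO would apply each update at the point where `record` is called, so later decisions in the same instance would already be scored with the changed weights. Early update would instead stop processing the instance at the first violation. Deferring the update keeps the weights fixed for the whole instance while still learning from every violation in it.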
