Phrase Structures and Dependencies for End-to-End Coreference Resolution

We present experiments in data-driven coreference resolution that compare the effect of different syntactic representations provided as features in the coreference classification step: no syntax, phrase structure representations, dependency representations, and combinations of the representation types. We compare the end-to-end performance of a parametrized state-of-the-art coreference resolution system on the English data from the CoNLL 2012 shared task. On their own, phrase structures are more useful than dependencies, but the combinations yield the highest performance and a significant improvement in the resolution of pronouns. Enriching phrase structure with dependency trees obtained from an independent parser is most helpful, but extending the predicted phrase structure using just pattern-based phrase-to-dependency conversion also seems to provide signals for the machine learning that cannot be distilled from phrase structure alone (despite intense feature selection). This is an interesting result for a highly configurational language: it is easier to learn generalizations over grammatical constraints on coreference when grammatical relations are explicitly provided.
