Same Referent, Different Words: Unsupervised Mining of Opaque Coreferent Mentions

Coreference resolution systems rely heavily on string overlap (e.g., Google Inc. and Google) and perform poorly on mentions that share few or no words (opaque mentions), such as Google and the search giant. Yet prior attempts to resolve opaque pairs using ontologies or distributional semantics hurt precision more than they improved recall. We present a new unsupervised method for mining opaque pairs. Our intuition is to restrict distributional semantics to articles about the same event, thus promoting referential match. Using an English comparable corpus of tech news, we built a dictionary of opaque coreferent mentions (only 3% are in WordNet). Our dictionary can be integrated into any coreference system (it increases the performance of a state-of-the-art system by 1% F1 on all measures) and is easily extended using news aggregators.