Using Knowledge and Constraints To Find the Best Antecedent

Coreference resolution, the problem of clustering mentions into entities, is critical for natural language understanding. This paper studies coreference resolution in the emerging domain of Electronic Health Records (EHRs). The commonly used “best-link” model for coreference resolution considers only the scores from a pairwise classifier when selecting the best antecedent. In this paper, we extend this model to include several constraints derived from the surface form of the mentions and the context in which they appear. Another major contribution of this paper is to show how domain-specific knowledge sources, mention parsing, and clinical descriptors can be used to derive features that improve coreference resolution performance. We present experiments on four clinical datasets illustrating that our approach outperforms a strong baseline and a state-of-the-art system by a wide margin.
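
To make the best-link idea concrete, the following is a minimal sketch of best-link decoding with an additive penalty for violated constraints. It is an illustration under stated assumptions, not the paper's implementation: the function names, the toy token-overlap scorer, the laterality constraint, the penalty weight, and the threshold are all hypothetical, and a real system would use a trained pairwise classifier in place of `toy_score`.

```python
from typing import Callable, List, Optional, Sequence

Mention = str
PairScorer = Callable[[Mention, Mention], float]
# A constraint returns True when the candidate pair VIOLATES it (hypothetical convention).
Constraint = Callable[[Mention, Mention], bool]


def best_link(
    mentions: Sequence[Mention],
    score: PairScorer,
    constraints: Sequence[Constraint] = (),
    penalty: float = 1.0,
    threshold: float = 0.0,
) -> List[Optional[int]]:
    """For each mention, pick the highest-scoring earlier mention as antecedent.

    Each violated constraint subtracts `penalty` from the pairwise score.
    A mention gets no antecedent (None) if no candidate exceeds `threshold`.
    On ties, the earlier candidate is kept.
    """
    antecedents: List[Optional[int]] = []
    for i, mention in enumerate(mentions):
        best_j: Optional[int] = None
        best_score = threshold
        for j in range(i):
            s = score(mentions[j], mention)
            s -= penalty * sum(1 for c in constraints if c(mentions[j], mention))
            if s > best_score:
                best_j, best_score = j, s
        antecedents.append(best_j)
    return antecedents


def clusters_from_links(antecedents: Sequence[Optional[int]]) -> List[List[int]]:
    """Turn antecedent links into entity clusters of mention indices."""
    cluster_of: List[int] = []
    clusters: List[List[int]] = []
    for i, a in enumerate(antecedents):
        if a is None:
            cluster_of.append(len(clusters))
            clusters.append([i])
        else:
            cluster_of.append(cluster_of[a])
            clusters[cluster_of[a]].append(i)
    return clusters


def toy_score(a: Mention, b: Mention) -> float:
    """Toy stand-in for a pairwise classifier: number of shared tokens."""
    return float(len(set(a.split()) & set(b.split())))


def laterality_conflict(a: Mention, b: Mention) -> bool:
    """Hypothetical surface-form constraint: 'left' and 'right' must not corefer."""
    sides = {"left", "right"}
    sa, sb = sides & set(a.split()), sides & set(b.split())
    return bool(sa) and bool(sb) and sa != sb


mentions = ["left arm pain", "right arm pain"]
# Scores alone link the two mentions because they share two tokens.
print(best_link(mentions, toy_score, threshold=0.5))  # [None, 0]
# The constraint penalty pushes the pair below the threshold.
print(best_link(mentions, toy_score, [laterality_conflict], penalty=5.0, threshold=0.5))  # [None, None]
```

In this sketch the constraint acts as a soft penalty on the pairwise score; treating it as a hard filter instead would be an equally plausible design, and the abstract does not say which form the authors use.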
