Joint inference for end-to-end coreference resolution for clinical notes

Recent US government initiatives have led to wide adoption of Electronic Health Records (EHRs), and a growing number of health care institutions store patients' data electronically. EHRs contain valuable information that can support important applications such as Clinical Decision Support (CDS), which makes Information Extraction (IE) from EHRs a promising research area. This paper presents a robust method for end-to-end coreference resolution in clinical narratives. Our experiments use the datasets provided by the i2b2/VA team for the i2b2/VA 2011 shared task on coreference resolution; one part of this data was annotated according to the ODIE guidelines and the other according to the i2b2 guidelines. We designed a global inference strategy for end-to-end coreference resolution that jointly determines the mention types and the coreference relations between mentions. This technique avoids the error propagation that is common in pipeline systems. For pronominal resolution, we developed separate strategies for resolving different pronouns. We report the best results to date on both the ODIE and i2b2 data, in both settings: (1) when gold mentions are given and (2) end-to-end coreference resolution. Because the ODIE and i2b2 data are annotated quite differently, achieving the best results on both demonstrates the robustness of our algorithm.
