Assertion modeling and its role in clinical phenotype identification

This paper describes an approach to assertion classification and an empirical study on the impact this task has on phenotype identification, a real world application in the clinical domain. The task of assertion classification is to assign to each medical concept mentioned in a clinical report (e.g., pneumonia, chest pain) a specific assertion category (e.g., present, absent, and possible). To improve the classification of medical assertions, we propose several new features that capture the semantic properties of special cue words highly indicative of a specific assertion category. The results obtained outperform the current state-of-the-art results for this task. Furthermore, we confirm the intuition that assertion classification contributes in significantly improving the results of phenotype identification from free-text clinical records.

[1]  Yiming Yang,et al.  A Comparative Study on Feature Selection in Text Categorization , 1997, ICML.

[2]  Christiane Fellbaum,et al.  Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.

[3]  Joel D. Martin,et al.  Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010 , 2011, J. Am. Medical Informatics Assoc..

[4]  Roser Morante,et al.  Proceedings of the Workshop on Negation and Speculation in Natural Language Processing, NeSp-NLP@ACL 2010, Uppsala, Sweden, July 10, 2010 , 2010, NeSp-NLP@ACL.

[5]  Dan I. Moldovan,et al.  Semantic Representation of Negation Using Focus Detection , 2011, ACL.

[6]  Jan Svartvik,et al.  A __ comprehensive grammar of the English language , 1988 .

[7]  Özlem Uzuner,et al.  Does negation really matter? , 2010, NeSp-NLP@ACL.

[8]  Alan R. Aronson,et al.  Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program , 2001, AMIA.

[9]  Peter L. Elkin,et al.  A controlled trial of automated classification of negation from clinical notes , 2005, BMC Medical Informatics Decis. Mak..

[10]  János Csirik,et al.  The BioScope corpus: biomedical texts annotated for uncertainty, negation and their scopes , 2008, BMC Bioinformatics.

[11]  Dunja Mladenic,et al.  Feature selection on hierarchy of web documents , 2003, Decis. Support Syst..

[12]  Cosmin Adrian Bejan,et al.  Pneumonia identification using statistical feature selection , 2012, J. Am. Medical Informatics Assoc..

[13]  Ilya M. Goldin,et al.  Learning to Detect Negation with ‘Not’ in Medical Texts , 2003 .

[14]  S. T. Buckland,et al.  Computer-Intensive Methods for Testing Hypotheses. , 1990 .

[15]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[16]  David Tresner-Kirsch,et al.  MITRE system for clinical assertion status classification , 2011, J. Am. Medical Informatics Assoc..

[17]  Fei Xia,et al.  Identifying Patients with Pneumonia from Free-Text Intensive Care Unit Reports , 2011 .

[18]  Shuying Shen,et al.  2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text , 2011, J. Am. Medical Informatics Assoc..

[19]  Yang Huang,et al.  A novel hybrid approach to automated negation detection in clinical radiology reports. , 2007, Journal of the American Medical Informatics Association : JAMIA.

[20]  Özlem Uzuner,et al.  Machine learning and rule-based approaches to assertion classification. , 2009, Journal of the American Medical Informatics Association : JAMIA.

[21]  Jianfeng Gao,et al.  MSR SPLAT, a language analysis toolkit , 2012, HLT-NAACL.

[22]  Ellen Riloff,et al.  Improving Classification of Medical Assertions in Clinical Notes , 2011, ACL.

[23]  Olivier Bodenreider,et al.  The Unified Medical Language System (UMLS): integrating biomedical terminology , 2004, Nucleic Acids Res..

[24]  János Csirik,et al.  The CoNLL-2010 Shared Task: Learning to Detect Hedges and their Scope in Natural Language Text , 2010, CoNLL Shared Task.

[25]  Wendy W. Chapman,et al.  A Simple Algorithm for Identifying Negated Findings and Diseases in Discharge Summaries , 2001, J. Biomed. Informatics.

[26]  Prakash M. Nadkarni,et al.  Research Paper: Use of General-purpose Negation Detection to Augment Concept Indexing of Medical Documents: A Quantitative Study Using the UMLS , 2001, J. Am. Medical Informatics Assoc..

[27]  Wendy W. Chapman,et al.  ConText: An algorithm for determining negation, experiencer, and temporal status from clinical reports , 2009, J. Biomed. Informatics.

[28]  W. Bruce Croft,et al.  Research Paper: Ad Hoc Classification of Radiology Reports , 1999, J. Am. Medical Informatics Assoc..

[29]  Sanda M. Harabagiu,et al.  A flexible framework for deriving assertions from electronic medical records , 2011, J. Am. Medical Informatics Assoc..