Classifying tumor event attributes in radiology reports

Radiology reports contain vital diagnostic information that characterizes patient disease progression. However, this information is recorded as free text, which is difficult to query for secondary use. Automatic extraction of important information, such as tumor events, using natural language processing offers opportunities for improved clinical decision support, cohort identification, and retrospective evidence‐based research for cancer patients. The goal of this work was to classify three tumor event attributes (negation, temporality, and malignancy) using biomedical ontology and linguistically enriched features. We report our results on an annotated corpus of 101 hepatocellular carcinoma patient radiology reports and show that better attribute classification improves overall template structuring. Classification performance for negation identification, past temporality classification, and malignancy classification was 0.94, 0.62, and 0.77 F1, respectively. Incorporating the attributes into full templates raised performance for tumor‐related events to 0.72 F1 from a baseline of 0.65 F1. Improved negation, malignancy, and temporality classification led to significant improvements in template extraction for the majority of categories. We present our machine‐learning approach to identifying these tumor event attributes from radiology reports and highlight challenges and areas for improvement.
