Identification of Patients with Acute Lung Injury from Free-Text Chest X-Ray Reports

Identifying complex clinical phenotypes among critically ill patients is a major challenge in clinical research. The overall goal of our work is to develop automated approaches that accurately identify critical illness phenotypes and thereby avoid resource-intensive manual abstraction. In this paper, we describe a text processing method that uses Natural Language Processing (NLP) and supervised text classification to identify patients who are positive for Acute Lung Injury (ALI) based on the information available in free-text chest x-ray reports. To improve classification performance, we extended the baseline unigram representation with bigram and trigram features, enriched the n-gram features with assertion analysis, and applied statistical feature selection. We evaluated the approach with 10-fold cross-validation. Our best-performing classifier identified patients with ALI with 81.70% precision (positive predictive value), 75.59% recall (sensitivity), 78.53% F-score, 74.61% negative predictive value, and 76.80% specificity.
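As a rough illustration of two ideas mentioned above, the following Python sketch shows how unigram, bigram, and trigram features might be extracted from a report, and how the five reported evaluation measures are defined from a binary confusion matrix. It is not the implementation used in the paper: the function names, the whitespace tokenizer, and the example counts are assumptions made for illustration only, and assertion analysis and feature selection are not shown.

```python
from typing import Dict, List


def ngram_features(text: str, max_n: int = 3) -> List[str]:
    """Return all word n-grams of length 1..max_n (hypothetical helper).

    Assumes simple lowercasing and whitespace tokenization; the paper does
    not specify its tokenizer here.
    """
    tokens = text.lower().split()
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute the five measures from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    precision = tp / (tp + fp)      # positive predictive value
    recall = tp / (tp + fn)         # sensitivity
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score(precision, recall),
        "npv": tn / (tn + fn),      # negative predictive value
        "specificity": tn / (tn + fp),
    }


if __name__ == "__main__":
    # Illustrative phrase, not taken from the study data.
    print(ngram_features("bilateral diffuse infiltrates"))
    # Made-up counts, only to exercise the definitions.
    print(binary_metrics(tp=8, fp=2, tn=7, fn=3))
    # The reported precision and recall reproduce the reported F-score.
    print(round(f_score(81.70, 75.59), 2))
```

With the reported precision and recall, the harmonic mean works out to 2 × 81.70 × 75.59 / (81.70 + 75.59) ≈ 78.53, which matches the F-score stated in the abstract.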
