Deep Learning to Classify Radiology Free-Text Reports.

Purpose: To evaluate the performance of a deep learning convolutional neural network (CNN) model compared with a traditional natural language processing (NLP) model in extracting pulmonary embolism (PE) findings from thoracic computed tomography (CT) reports from two institutions.

Materials and Methods: Contrast material-enhanced CT examinations of the chest performed between January 1, 1998, and January 1, 2016, were selected. Two radiologists annotated the reports for three categories: the presence, chronicity, and location of PE. The classification performance of a CNN model that used an unsupervised learning algorithm to obtain vector representations of words was compared with that of the open-source application PeFinder. Sensitivity, specificity, accuracy, and F1 scores were determined for both the CNN model and PeFinder in the internal and external validation sets.

Results: The CNN model demonstrated an accuracy of 99% and an area under the curve value of 0.97. For internal validation report data, the CNN model had a statistically significantly larger F1 score (0.938) than did PeFinder (0.867) when classifying findings as either PE positive or PE negative, but no significant difference in sensitivity, specificity, or accuracy was found. For external validation report data, no statistically significant difference between the performance of the CNN model and PeFinder was found.

Conclusion: A deep learning CNN model can classify radiology free-text reports with accuracy equivalent to or beyond that of an existing traditional NLP model.

© RSNA, 2017. Online supplemental material is available for this article.
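For readers less familiar with the four metrics reported for the CNN model and PeFinder, the following is a minimal sketch of how sensitivity, specificity, accuracy, and F1 score follow from confusion-matrix counts for a binary PE-positive versus PE-negative task. The `Counts` class, the `binary_metrics` function, and the example counts are hypothetical illustrations; they are not the study's code or data.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion-matrix counts for a binary classifier (hypothetical helper)."""
    tp: int  # reports correctly labeled PE positive
    fp: int  # PE-negative reports labeled positive
    tn: int  # reports correctly labeled PE negative
    fn: int  # PE-positive reports labeled negative


def binary_metrics(c: Counts) -> dict:
    """Return sensitivity, specificity, accuracy, and F1 score.

    Assumes every denominator is nonzero, i.e. the data contain at least
    one positive report, one negative report, and one positive prediction.
    """
    sensitivity = c.tp / (c.tp + c.fn)   # also called recall
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    precision = c.tp / (c.tp + c.fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Made-up counts for illustration only; not taken from the article.
example = binary_metrics(Counts(tp=45, fp=5, tn=940, fn=10))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because F1 ignores true negatives, two classifiers can have similar accuracy and specificity on a data set dominated by PE-negative reports and still differ in F1 score.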