Detection of unexpected findings in radiology reports: A comparative study of machine learning approaches

Abstract This study explores machine learning methods for detecting unexpected findings in Spanish radiology reports. In a radiology report, unexpected findings are radiological signs identified in an imaging exam that meet two criteria: they are not apparently related to the results expected a priori from the exam, and they involve a clinical emergency or urgent situation that must be reported promptly to the prescribing physician or another medical specialist, as well as to the patient, in order to preserve life and/or prevent dangerous outcomes. Several traditional machine learning and deep learning classification algorithms are evaluated and compared. For this task we use 5947 anonymized radiology reports from HT medica. Experimental results suggest that Convolutional Neural Network models perform better than traditional machine learning methods. The best F1 score for identifying an unexpected finding was 90%. Finally, we perform an error analysis to guide future improvements.
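
The F1 score reported for the unexpected-finding class is the harmonic mean of precision and recall. The following is a minimal sketch of that computation, assuming binary labels in which 1 marks a report with an unexpected finding. The function name and the toy labels are illustrative assumptions and are not taken from the study.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Return F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical, not data from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
score = f1_score_binary(y_true, y_pred)
```

In this toy case there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and so is F1.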
