Neural classification of Norwegian radiology reports: using NLP to detect findings in CT-scans of children

Background: For quality assurance purposes, machine learning models were trained to classify Norwegian radiology reports of paediatric CT examinations according to whether they describe abnormal findings. Methods: A radiologist classified 13,506 reports from CT scans of children, 1000 reports from CT scans of adults and 1000 reports from X-ray examinations of adults as positive or negative, according to the presence of abnormal findings. Inter-rater reliability was evaluated by comparison with a clinician’s classifications of 500 reports. Test–retest reliability of the radiologist was assessed on the same 500 reports. A convolutional neural network model (CNN), a bidirectional recurrent neural network model (bi-LSTM) and a support vector machine model (SVM) were trained on a random selection of the children’s data set. The models were evaluated on the remaining children’s CT reports and on the adult data sets. Results: Test–retest reliability: Cohen’s Kappa = 0.86 and F1 = 0.919. Inter-rater reliability: Kappa = 0.80 and F1 = 0.885. Model performance on the children’s CT data was as follows. CNN: AUC = 0.981, F1 = 0.930; bi-LSTM: AUC = 0.978, F1 = 0.927; SVM: AUC = 0.975, F1 = 0.912. On the adult data sets, the models had AUC around 0.95 and F1 around 0.91. Conclusions: The models performed close to perfectly within their defined domain (AUC of at least 0.975 on the children’s CT reports), and also performed convincingly on reports pertaining to a different patient group and a different modality. The models were deemed suitable for classifying radiology reports for future quality assurance purposes, where the fraction of examinations with abnormal findings in different sub-groups of patients is a parameter of interest.
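The agreement measures reported above (Cohen's Kappa and F1 between two sets of binary labels) can be illustrated with a minimal sketch. The label lists below are invented toy data, not the study's annotations, and the function names `cohen_kappa` and `f1_score` are hypothetical helpers written for this illustration rather than the authors' code.

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa: agreement between two raters, corrected for chance."""
    n = len(labels_a)
    # Observed agreement: fraction of reports given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def f1_score(reference, predicted, positive=1):
    """F1 for the positive class, treating `reference` as the ground truth."""
    pairs = list(zip(reference, predicted))
    tp = sum(r == positive and p == positive for r, p in pairs)
    fp = sum(r != positive and p == positive for r, p in pairs)
    fn = sum(r == positive and p != positive for r, p in pairs)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Toy example (assumed data): 1 = abnormal finding described, 0 = no finding.
radiologist = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
clinician = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]

kappa = cohen_kappa(radiologist, clinician)
f1 = f1_score(radiologist, clinician)
print(f"Kappa = {kappa:.2f}, F1 = {f1:.2f}")
```

In this toy case the raters agree on 8 of 10 reports while chance agreement is 0.5, so Kappa is lower than the raw agreement; this is why the abstract reports Kappa alongside F1.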
