CT Data Curation for Liver Patients: Phase Recognition in Dynamic Contrast-Enhanced CT

As the demand for more descriptive machine learning models grows within medical imaging, bottlenecks caused by data paucity will worsen. Collecting data at sufficient scale will therefore require automated tools that harvest data/label pairs from messy, real-world datasets such as hospital PACS. This is the focus of our work: we present a principled data curation tool that extracts multi-phase CT liver studies and identifies each scan's phase from a real-world, heterogeneous hospital PACS dataset. Emulating a typical deployment scenario, we first obtain a set of noisy labels from our institutional partners, text mined from DICOM tags using simple rules. We then train a deep learning system, built on a customized and streamlined 3D SE architecture, to identify non-contrast, arterial, venous, and delay phase dynamic CT liver scans while filtering out everything else, including other types of liver contrast studies. To exploit as much training data as possible, we also introduce an aggregated cross entropy loss that can learn from scans identified only as "contrast". Extensive experiments on a dataset of 43K scans from 7680 patient imaging studies demonstrate that our 3D SE architecture, armed with the aggregated loss, achieves a mean F1 of 0.977 and correctly harvests up to 92.7% of studies. This significantly outperforms both the text-mined labels and the standard-loss approach, and also outperforms other, more complex model architectures.
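The abstract does not spell out the aggregated cross entropy loss, so the following is only a minimal sketch of one plausible reading: when a scan carries the coarse label "contrast", the predicted probabilities of the contrast phases are summed before taking the negative log, so the scan still contributes a training signal without committing to a single phase. The class list, the `LABEL_TO_CLASSES` mapping, the function names, and the assumption that "contrast" covers exactly the arterial, venous, and delay phases are all illustrative and not taken from the paper.

```python
import math

# Hypothetical output classes of the phase classifier (assumed ordering).
PHASES = ["non_contrast", "arterial", "venous", "delay", "other"]

# Assumption: a coarse "contrast" label means "one of arterial/venous/delay".
# Fine-grained labels map to a single class, which recovers standard cross entropy.
LABEL_TO_CLASSES = {
    "non_contrast": {"non_contrast"},
    "arterial": {"arterial"},
    "venous": {"venous"},
    "delay": {"delay"},
    "other": {"other"},
    "contrast": {"arterial", "venous", "delay"},
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregated_cross_entropy(logits, label):
    """Negative log of the total probability assigned to the classes
    that are compatible with the (possibly coarse) label."""
    probs = softmax(logits)
    allowed = LABEL_TO_CLASSES[label]
    mass = sum(p for name, p in zip(PHASES, probs) if name in allowed)
    return -math.log(mass)


if __name__ == "__main__":
    logits = [0.1, 2.0, 1.5, 0.3, -1.0]  # made-up network outputs for one scan
    print("loss with fine label 'arterial':", aggregated_cross_entropy(logits, "arterial"))
    print("loss with coarse label 'contrast':", aggregated_cross_entropy(logits, "contrast"))
```

Under this reading, a scan labelled only "contrast" is penalized for probability mass placed on the non-contrast or other classes, but not for how that mass is split among the three contrast phases.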
