Intra-model Variability in COVID-19 Classification Using Chest X-ray Images

The use of X-ray and computed tomography (CT) imaging for COVID-19 screening has gained significant traction in AI research since the start of the coronavirus pandemic. Despite continued advances, many concerns remain about model reliability in clinical settings. Much has been published, but with limited transparency about expected model performance. We address this limitation with a set of experiments that quantify baseline performance metrics and variability for COVID-19 detection in chest X-rays across 12 common deep learning architectures. Specifically, we adopted an experimental paradigm that holds the train-validation-test split and model architecture fixed, so that the remaining sources of prediction variability are model weight initialization, random data augmentation transformations, and batch shuffling. Each model architecture was trained 5 separate times on identical train-validation-test splits of a publicly available X-ray image dataset provided by Cohen et al. (2020). Results indicate that, even within a single model architecture, model behavior varies meaningfully between trained models. The best-performing models achieve a false negative rate of 3 out of 20 for detecting COVID-19 in a hold-out set. While these results show promise for using AI in COVID-19 screening, they further support the urgent need for diverse medical imaging datasets so that trained models yield consistent prediction outcomes. We hope that these modeling results accelerate work on building a more robust dataset and a viable screening tool for COVID-19.
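The repeated-training paradigm described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 20-positive/20-negative hold-out layout, and the `simulate_run` stand-in (which replaces real model training with seeded random misses) are all assumptions made for illustration. The sketch only shows how, with the hold-out labels fixed, the false negative rate per run and the agreement between runs might be summarized.

```python
import random
from statistics import mean, pstdev


def false_negative_rate(y_true, y_pred):
    """Fraction of true positives (label 1) that were predicted negative (0)."""
    positive_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not positive_preds:
        raise ValueError("y_true contains no positive cases")
    misses = sum(1 for p in positive_preds if p == 0)
    return misses / len(positive_preds)


def simulate_run(y_true, seed, miss_prob=0.2):
    """Hypothetical stand-in for one training run.

    A real experiment would train a network here, with the seed driving
    weight initialization, augmentation, and batch shuffling. This stub
    only flips some positive labels to negative at random.
    """
    rng = random.Random(seed)
    return [0 if (t == 1 and rng.random() < miss_prob) else t for t in y_true]


def summarize_runs(y_true, runs):
    """Summarize variability across runs evaluated on the same hold-out set."""
    rates = [false_negative_rate(y_true, preds) for preds in runs]
    # A sample counts as "unanimous" when every run gave it the same label.
    unanimous = sum(1 for preds in zip(*runs) if len(set(preds)) == 1)
    return {
        "fnr_per_run": rates,
        "fnr_mean": mean(rates),
        "fnr_std": pstdev(rates),
        "agreement": unanimous / len(y_true),
    }


# Assumed hold-out layout: 20 positive and 20 negative cases, identical across runs.
y_true = [1] * 20 + [0] * 20
runs = [simulate_run(y_true, seed) for seed in range(5)]
summary = summarize_runs(y_true, runs)
print(summary)
```

Because the labels and split are identical for every run, any spread in `fnr_per_run` or any drop in `agreement` below 1.0 is attributable to the seed-dependent factors alone, which is the quantity the experiments set out to measure.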

[1]  Hayit Greenspan,et al.  Coronavirus Detection and Analysis on Chest CT with Deep Learning , 2020, ArXiv.

[2]  Alexander Wong,et al.  COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images , 2020, Scientific reports.

[3]  Paolo Napoletano,et al.  Benchmark Analysis of Representative Deep Neural Network Architectures , 2018, IEEE Access.

[4]  K. Cao,et al.  Artificial Intelligence Distinguishes COVID-19 from Community Acquired Pneumonia on Chest CT , 2020, Radiology.

[5]  Zhuowen Tu,et al.  Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  K. Cao,et al.  Using Artificial Intelligence to Detect COVID-19 and Community-acquired Pneumonia Based on Pulmonary CT: Evaluation of the Diagnostic Accuracy , 2020.

[7]  Bo Chen,et al.  MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[8]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[9]  Kevin A. Schneider,et al.  Automatic Detection of Coronavirus Disease (COVID-19) in X-ray and CT Images: A Machine Learning Based Approach , 2020, Biocybernetics and Biomedical Engineering.

[10]  Nikos Komodakis,et al.  Wide Residual Networks , 2016, BMVC.

[11]  Dinggang Shen,et al.  Review of Artificial Intelligence Techniques in Imaging Data Acquisition, Segmentation, and Diagnosis for COVID-19 , 2020, IEEE Reviews in Biomedical Engineering.

[12]  Q. Tao,et al.  Correlation of Chest CT and RT-PCR Testing in Coronavirus Disease 2019 (COVID-19) in China: A Report of 1014 Cases , 2020, Radiology.

[13]  Yael Mandel-Gutfreund,et al.  Evaluation of COVID-19 RT-qPCR test in multi-sample pools , 2020, medRxiv.

[14]  Yaozong Gao,et al.  Large-Scale Screening of COVID-19 from Community Acquired Pneumonia using Infection Size-Aware Classification , 2020, ArXiv.

[15]  M. Castillo,et al.  The Industry of CT Scanning , 2012, American Journal of Neuroradiology.

[16]  Ioannis D. Apostolopoulos,et al.  Extracting Possibly Representative COVID-19 Biomarkers from X-ray Images with Deep Learning Approach and Image Data Related to Pulmonary Diseases , 2020, Journal of medical and biological engineering.

[17]  Jing Xu,et al.  MiniSeg: An Extremely Minimum Network for Efficient COVID-19 Segmentation , 2020, AAAI.

[18]  Xiaowei Xu,et al.  A Deep Learning System to Screen Novel Coronavirus Disease 2019 Pneumonia , 2020, Engineering.

[19]  Hadley Wickham,et al.  ggplot2 - Elegant Graphics for Data Analysis (2nd Edition) , 2017.

[20]  Jonathan H. Chung,et al.  Essentials for Radiologists on COVID-19: An Update—Radiology Scientific Expert Panel , 2020, Radiology.

[21]  Alexander Wong,et al.  COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest Radiography Images , 2020, ArXiv.

[22]  Thomas G. Dietterich.  Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms , 1998, Neural Computation.

[23]  Jannis Born,et al.  POCOVID-Net: Automatic Detection of COVID-19 From a New Lung Ultrasound Imaging Dataset (POCUS) , 2020, ArXiv.

[24]  Y. Hu,et al.  Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China , 2020, The Lancet.

[25]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[28]  K. Yuen,et al.  Imaging Profile of the COVID-19 Infection: Radiologic Findings and Literature Review , 2020, Radiology. Cardiothoracic imaging.

[29]  Joseph Paul Cohen,et al.  COVID-19 Image Data Collection , 2020, ArXiv.

[30]  Pedro M. Valero-Mora,et al.  ggplot2: Elegant Graphics for Data Analysis , 2010.

[31]  Lian-lian Wu,et al.  Deep learning-based model for detecting 2019 novel coronavirus pneumonia on high-resolution computed tomography: a prospective study , 2020, medRxiv.

[32]  Wei Zhao,et al.  COVID-19 Chest CT Image Segmentation - A Deep Convolutional Neural Network Solution , 2020, ArXiv.