Deep Mining External Imperfect Data for Chest X-Ray Disease Screening

Deep learning approaches have demonstrated remarkable progress in automatic chest X-ray analysis. Because deep models are data-driven, their training data must cover a broad distribution, so it is essential to integrate knowledge from multiple datasets, especially for medical images. However, learning a disease classification model with extra chest X-ray (CXR) data remains challenging. Recent studies have shown that joint training on different CXR datasets hits a performance bottleneck, and few efforts have been made to address this obstacle. In this paper, we argue that incorporating an external CXR dataset leads to imperfect training data, which gives rise to these challenges. Specifically, the imperfection is twofold: domain discrepancy, as image appearance varies across datasets; and label discrepancy, as different datasets are partially labeled. To this end, we formulate multi-label thoracic disease classification as weighted independent binary tasks, one per category. For common categories shared across domains, we adopt task-specific adversarial training to alleviate the feature differences. For categories that exist in a single dataset, we present uncertainty-aware temporal ensembling of model predictions to further mine information from the missing labels. In this way, our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability. We conduct extensive experiments on three datasets with more than 360,000 chest X-ray images. Our method outperforms competing models and sets state-of-the-art performance on the official NIH test set with 0.8349 AUC, demonstrating its effectiveness in utilizing the external dataset to improve the internal classification.
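As a rough illustration of the "weighted independent binary tasks" formulation under partial labels, the sketch below computes a per-category binary cross-entropy that skips categories a dataset does not annotate, and keeps a running average of predictions in the spirit of temporal ensembling. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names `masked_weighted_bce` and `ema_update`, the use of `None` to mark a missing label, the normalization by the sum of weights, and the smoothing factor `alpha` are all hypothetical, and the uncertainty-aware and adversarial components are omitted.

```python
import math


def masked_weighted_bce(probs, labels, weights):
    """Weighted binary cross-entropy averaged over labeled categories only.

    probs   -- predicted probability per category
    labels  -- 1, 0, or None when the dataset does not annotate that category
               (the None convention is an assumption made for this sketch)
    weights -- per-category task weight (hypothetical values)
    """
    eps = 1e-7  # keeps log() away from 0
    total, weight_sum = 0.0, 0.0
    for p, y, w in zip(probs, labels, weights):
        if y is None:
            continue  # missing label: this binary task contributes no loss
        p = min(max(p, eps), 1.0 - eps)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0


def ema_update(ensemble, probs, alpha=0.6):
    """Exponential moving average of predictions across training steps.

    A plain EMA stands in for temporal ensembling here; the averaged
    predictions could serve as soft targets for categories with no label.
    """
    return [alpha * e + (1.0 - alpha) * p for e, p in zip(ensemble, probs)]


# Example: three categories, the second one is unlabeled in this dataset.
loss = masked_weighted_bce([0.9, 0.2, 0.5], [1, None, 0], [1.0, 1.0, 1.0])
targets = ema_update([0.0, 1.0], [1.0, 0.0], alpha=0.5)
```

Treating each category as its own binary task is what makes the masking straightforward: a missing label removes one task from the sum without affecting the others.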
