SPCXR: Self-supervised Pretraining using Chest X-rays Towards a Domain Specific Foundation Model

Chest X-rays (CXRs) are a widely used imaging modality for the diagnosis and prognosis of lung disease. The associated image analysis tasks vary and include pathology detection and lung segmentation. A large body of work develops machine learning algorithms for specific tasks; a significant recent example is coronavirus disease (COVID-19) detection from CXR data. However, traditional diagnostic tool design based on supervised learning is burdened by the need for annotated training data, and the annotations must be of good quality to support good clinical outcomes. Here, we propose an alternative, self-supervised paradigm in which a general representation of CXRs is learned using a group-masked self-supervised framework. The pre-trained model is then fine-tuned for domain-specific tasks such as COVID-19 detection, pneumonia detection, and general health screening. We show that the same pre-training can also be used for lung segmentation. The proposed paradigm performs robustly across multiple downstream tasks, which demonstrates the success of the pre-training. Moreover, the performance of the pre-trained models on test data with significant drift indicates that a better generic representation has been learned. The methods are further validated on COVID-19 detection in a unique small-scale pediatric data set, where the accuracy gain (~25%) over a supervised transformer-based method is significant. This adds credence to the strength and reliability of the proposed framework and pre-training strategy.
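The group-masking idea behind the pre-training can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the 50% mask ratio, the 16-pixel patch size, the rectangular block shape, the zero-fill corruption, and the L1 loss restricted to corrupted pixels are all assumptions chosen for illustration. The sketch masks contiguous groups of patches rather than isolated ones, so a model would have to recover whole regions from the surrounding context.

```python
import numpy as np

def group_mask(grid_h, grid_w, mask_ratio=0.5, max_block=4, seed=0):
    """Return a boolean (grid_h, grid_w) patch mask; True marks a patch to corrupt.

    Rectangular groups of neighbouring patches are added until at least
    mask_ratio of the grid is covered (all parameters are illustrative).
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros((grid_h, grid_w), dtype=bool)
    target = int(np.ceil(mask_ratio * grid_h * grid_w))
    while mask.sum() < target:
        # Sample a block size, clamped so the block always fits in the grid.
        bh = int(rng.integers(1, min(max_block, grid_h) + 1))
        bw = int(rng.integers(1, min(max_block, grid_w) + 1))
        top = int(rng.integers(0, grid_h - bh + 1))
        left = int(rng.integers(0, grid_w - bw + 1))
        mask[top:top + bh, left:left + bw] = True
    return mask

def corrupt(image, mask, patch=16):
    """Zero out the pixels of every masked patch in a 2-D grayscale image."""
    # Expand the patch-level mask to pixel resolution.
    pixel_mask = np.repeat(np.repeat(mask, patch, axis=0), patch, axis=1)
    out = image.copy()
    out[pixel_mask] = 0.0
    return out, pixel_mask

def reconstruction_loss(pred, target, pixel_mask):
    """Mean absolute error computed only over the corrupted pixels."""
    return float(np.abs(pred - target)[pixel_mask].mean())

if __name__ == "__main__":
    image = np.ones((224, 224))          # stand-in for a normalised CXR
    mask = group_mask(14, 14)            # 14 x 14 grid of 16-pixel patches
    corrupted, pixel_mask = corrupt(image, mask)
    # A real model would predict the original image from `corrupted`;
    # here the corrupted input itself serves as a trivial baseline prediction.
    print(mask.mean(), reconstruction_loss(corrupted, image, pixel_mask))
```

In an actual pre-training loop the prediction would come from a transformer encoder with a light reconstruction head, and the pre-trained encoder would then be fine-tuned on the downstream classification or segmentation task.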
