Big Self-Supervised Models Advance Medical Image Classification

Self-supervised pretraining followed by supervised fine-tuning has seen success in image recognition, especially when labeled examples are scarce, but has received limited attention in medical image analysis. This paper studies the effectiveness of self-supervised learning as a pretraining strategy for medical image classification. We conduct experiments on two distinct tasks: dermatology condition classification from digital camera images and multi-label chest X-ray classification, and demonstrate that self-supervised learning on ImageNet, followed by additional self-supervised learning on unlabeled domain-specific medical images, significantly improves the accuracy of medical image classifiers. We introduce a novel Multi-Instance Contrastive Learning (MICLe) method that uses multiple images of the underlying pathology per patient case, when available, to construct more informative positive pairs for self-supervised learning. Combining our contributions, we achieve an improvement of 6.7% in top-1 accuracy on dermatology classification and an improvement of 1.1% in mean AUC on chest X-ray classification, outperforming strong supervised baselines pretrained on ImageNet. In addition, we show that big self-supervised models are robust to distribution shift and can learn efficiently from a small number of labeled medical images.
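The core idea behind MICLe, as described above, is to treat two distinct images of the same patient's pathology as a positive pair for a contrastive objective, falling back to standard augmentation-based pairs when only one image exists. The sketch below is an illustrative reconstruction, not the paper's implementation: `micle_pairs` and the NumPy version of the SimCLR-style NT-Xent loss are hypothetical names operating on precomputed embedding vectors rather than on images through an encoder.

```python
import numpy as np

def micle_pairs(patient_images, rng):
    """For each patient case, form a positive pair from two distinct images of
    the same underlying pathology when available; with a single image, pair it
    with itself (standing in for two random augmentations of that image)."""
    pairs = []
    for imgs in patient_images.values():
        if len(imgs) >= 2:
            i, j = rng.choice(len(imgs), size=2, replace=False)
            pairs.append((imgs[i], imgs[j]))
        else:
            pairs.append((imgs[0], imgs[0]))
    return pairs

def nt_xent(z1, z2, tau=0.1):
    """SimCLR-style normalized-temperature cross-entropy over a batch of
    positive pairs; z1[k] and z2[k] are embeddings of the k-th pair, and all
    other samples in the batch act as negatives."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / tau
    n = len(z1)
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    # index of each sample's positive partner in the concatenated batch
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logprob = sim - np.logaddexp.reduce(sim, axis=1, keepdims=True)
    return -np.mean(logprob[np.arange(2 * n), pos])
```

In practice the pairs would be passed through augmentations and an encoder network before the loss; the point of the sketch is only that multi-image patient cases yield positives that differ in pose, lighting, and body site, which is what makes them more informative than two augmentations of one photograph.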
