Generalized radiograph representation learning via cross-supervision between images and free-text radiology reports

Pre-training lays the foundation for recent successes in deep-learning-based radiograph analysis. It learns transferable image representations through large-scale fully supervised or self-supervised learning on a source domain. However, supervised pre-training requires a complex and labor-intensive two-stage human-assisted annotation process, while self-supervised learning cannot compete with the supervised paradigm. To tackle these issues, we propose a cross-supervised methodology named REviewing FreE-text Reports for Supervision (REFERS), which acquires free supervision signals from the original radiology reports accompanying the radiographs. The proposed approach employs a vision transformer and is designed to learn joint representations from the multiple views within every patient study. REFERS outperforms its transfer-learning and self-supervised learning counterparts on four well-known X-ray datasets under extremely limited supervision. Moreover, REFERS even surpasses methods pre-trained on a source domain of radiographs with human-assisted structured labels. REFERS therefore has the potential to replace canonical pre-training methodologies.
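To make the cross-supervision idea concrete, the following is a minimal PyTorch sketch, not the authors' released code, of how report-derived supervision could drive a multi-view radiograph encoder: the views of one patient study are encoded by a shared backbone, fused with attention into a study-level representation, and aligned to an embedding of the free-text report via a symmetric contrastive loss. The class name, the attention-based fusion, the projection heads and the InfoNCE-style objective are illustrative assumptions; the full method may use additional report-related objectives not shown here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossSupervisedPretrainer(nn.Module):
    """Illustrative sketch (not the authors' code): multi-view radiographs from
    one patient study are encoded by a shared image backbone, fused across views,
    and aligned with the study's report embedding via a contrastive loss."""

    def __init__(self, image_encoder: nn.Module, embed_dim: int = 768,
                 proj_dim: int = 256, temperature: float = 0.07):
        super().__init__()
        self.image_encoder = image_encoder  # e.g. a vision transformer returning (N, embed_dim)
        # attention-based fusion over the views belonging to one study (assumed design)
        self.view_fusion = nn.MultiheadAttention(embed_dim, num_heads=8, batch_first=True)
        self.image_proj = nn.Linear(embed_dim, proj_dim)
        self.text_proj = nn.Linear(embed_dim, proj_dim)
        self.temperature = temperature

    def forward(self, views: torch.Tensor, report_emb: torch.Tensor) -> torch.Tensor:
        # views: (batch, n_views, C, H, W); report_emb: (batch, embed_dim) from a
        # BERT-style report encoder, assumed to be computed elsewhere
        b, v = views.shape[:2]
        feats = self.image_encoder(views.flatten(0, 1)).view(b, v, -1)
        fused, _ = self.view_fusion(feats, feats, feats)  # let views attend to each other
        study_emb = fused.mean(dim=1)                     # one joint vector per study

        img = F.normalize(self.image_proj(study_emb), dim=-1)
        txt = F.normalize(self.text_proj(report_emb), dim=-1)

        # symmetric InfoNCE: the report supervises the image representation and vice versa
        logits = img @ txt.t() / self.temperature
        targets = torch.arange(b, device=logits.device)
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    # toy run with a stand-in backbone; a real setup would plug in a ViT here
    backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 768))
    model = CrossSupervisedPretrainer(backbone)
    views = torch.randn(4, 2, 3, 224, 224)   # 4 studies, 2 views each
    reports = torch.randn(4, 768)             # pre-computed report embeddings
    print(model(views, reports).item())
```

After pre-training with such an objective, the image backbone would be kept and fine-tuned on the downstream X-ray classification datasets, while the projection heads and report encoder are discarded.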
