MVCNet: Multiview Contrastive Network for Unsupervised Representation Learning for 3D CT Lesions

arXiv:2108.07662v2 [cs.CV] 18 Aug 2021

Objective and Impact Statement. With the renaissance of deep learning, automatic diagnostic systems for computed tomography (CT) have been applied successfully in many settings. However, these successes depend largely on careful expert annotations, which are often scarce in practice. This motivates our interest in unsupervised representation learning.

Introduction. Recent studies have shown that self-supervised learning is an effective approach for learning representations, but most of these methods rely on the empirical design of transformations and pretext tasks.

Methods. To avoid the subjectivity associated with these methods, we propose MVCNet, a novel unsupervised three-dimensional (3D) representation learning method that works in a transformation-free manner. We view each 3D lesion from different orientations to collect multiple two-dimensional (2D) views. An embedding function is then learned by minimizing a contrastive loss, so that the 2D views of the same 3D lesion are aggregated and the 2D views of different lesions are separated. We evaluate the representations by training a simple classification head on top of the embedding layer.

Results. Experimental results show that MVCNet achieves state-of-the-art accuracies on the LIDC-IDRI (89.55%), LNDb (77.69%) and TianChi (79.96%) datasets for unsupervised representation learning. When fine-tuned on 10% of the labeled data, the accuracies are comparable to those of the supervised learning model (89.46% vs. 85.03%, 73.85% vs. 73.44%, and 83.56% vs. 83.34% on the three datasets, respectively).

Conclusion. The results indicate the superiority of MVCNet in learning representations when annotations are limited.
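To make the Methods step concrete, the following is a minimal NumPy sketch and not the authors' implementation. It rests on several assumptions that the abstract does not state: the 2D views are taken as the three central orthogonal slices of a cubic patch, the contrastive loss is an InfoNCE-style objective with a temperature of 0.1, and the embeddings are supplied directly instead of being produced by a network. The names `central_views` and `multiview_contrastive_loss` are hypothetical.

```python
import numpy as np


def central_views(volume):
    """Return three central orthogonal slices of a cubic 3D patch.

    Assumption: three axis-aligned planes stand in for the
    "different orientations" mentioned in the abstract.
    """
    d, h, w = volume.shape
    assert d == h == w, "this sketch expects a cubic patch"
    return np.stack([
        volume[d // 2, :, :],  # slice through the middle of axis 0
        volume[:, h // 2, :],  # slice through the middle of axis 1
        volume[:, :, w // 2],  # slice through the middle of axis 2
    ])


def multiview_contrastive_loss(z, temperature=0.1):
    """InfoNCE-style loss over view embeddings (an assumed form).

    z has shape (n_lesions, n_views, dim) with n_views >= 2. Views of
    the same lesion are positives; views of other lesions are negatives.
    """
    n, v, d = z.shape
    z = z / np.linalg.norm(z, axis=-1, keepdims=True)  # unit-length embeddings
    flat = z.reshape(n * v, d)
    lesion_id = np.repeat(np.arange(n), v)

    sim = flat @ flat.T / temperature
    self_mask = np.eye(n * v, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)  # an anchor never matches itself

    # Numerically stable log-softmax over all other views.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(
        np.exp(sim - row_max).sum(axis=1, keepdims=True)
    )

    # Average the log-probability of the positives for each anchor.
    pos_mask = (lesion_id[:, None] == lesion_id[None, :]) & ~self_mask
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob / pos_mask.sum(axis=1)).mean())


# Usage: a toy volume and random stand-in embeddings.
patch = np.random.default_rng(0).normal(size=(32, 32, 32))
views = central_views(patch)  # shape (3, 32, 32)
embeddings = np.random.default_rng(1).normal(size=(4, 3, 16))
loss = multiview_contrastive_loss(embeddings)
```

The loss decreases as the views of one lesion move closer together in the embedding space and away from the views of other lesions, which is the behaviour the Methods paragraph describes. In practice the embeddings would come from a trainable 2D encoder, and the loss would be written in an automatic-differentiation framework.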
