Momentum Contrastive Voxel-wise Representation Learning for Semi-supervised Volumetric Medical Image Segmentation

Automated segmentation in medical image analysis is a challenging task that requires a large amount of manually labeled data. However, manually annotating medical data is often laborious, and most existing learning-based approaches fail to accurately delineate object boundaries without effective geometric constraints. Contrastive learning, a sub-area of self-supervised learning, has recently emerged as a promising direction in multiple application fields. In this work, we present a novel Contrastive Voxel-wise Representation Distillation (CVRD) method with geometric constraints to learn global-local visual representations for volumetric medical image segmentation with limited annotations. Our framework learns global and local features by capturing 3D spatial context and rich anatomical information. Specifically, we introduce a voxel-to-volume contrastive algorithm to learn global information from 3D images, and we perform local voxel-to-voxel distillation to explicitly exploit local cues in the embedding space. Moreover, we integrate an elastic interaction-based active contour model as a geometric regularization term to enable fast and reliable object delineation in an end-to-end learning manner. Results on the Atrial Segmentation Challenge dataset demonstrate the superiority of our proposed scheme, especially when very little annotated data is available. The code will be available at https://github.com/charlesyou999648/CVRD.
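
The abstract does not give the form of the contrastive objective. As a rough illustration of the general idea only, the following sketch implements a generic InfoNCE-style loss, a common choice in contrastive learning; it is not the CVRD loss itself. The function names, the temperature value, and the toy embeddings are assumptions made for this example. Each query embedding is pulled toward the key at the same index (its positive) and pushed away from all other keys (its negatives).

```python
import math


def _normalize(vec):
    """Scale a vector to unit L2 norm so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce_loss(queries, keys, temperature=0.1):
    """Generic InfoNCE loss (illustrative; not the paper's exact objective).

    queries, keys: lists of equal-length embedding vectors. keys[i] is
    treated as the positive for queries[i]; every other key is a negative.
    """
    q = [_normalize(v) for v in queries]
    k = [_normalize(v) for v in keys]
    total = 0.0
    for i, qi in enumerate(q):
        # Temperature-scaled cosine similarity of this query to every key.
        logits = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        # Numerically stable log-sum-exp over all keys.
        top = max(logits)
        log_denom = top + math.log(sum(math.exp(l - top) for l in logits))
        # Negative log-probability of picking the positive key.
        total += log_denom - logits[i]
    return total / len(q)


# Toy embeddings (hypothetical values, for illustration only).
voxels = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.1], [0.3, 0.2, 1.0]]
aligned = info_nce_loss(voxels, voxels)                      # positives match
shuffled = info_nce_loss(voxels, voxels[1:] + voxels[:1])    # positives mismatched
```

In this toy setup the loss is lower when each query is paired with its own embedding than when the pairing is shifted by one position, which is the behaviour a contrastive objective is meant to reward.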
