Paper Information - Diff-UNet: A Diffusion Embedded Network for Volumetric Segmentation

Diff-UNet: A Diffusion Embedded Network for Volumetric Segmentation

In recent years, denoising diffusion models have shown remarkable success in generating semantically valuable pixel-wise representations for image generative modeling. In this study, we propose Diff-UNet, a novel end-to-end framework for medical volumetric segmentation. Our approach integrates the diffusion model into a standard U-shaped architecture to extract semantic information from the input volume effectively, yielding strong pixel-level representations for medical volumetric segmentation. To make the diffusion model's predictions more robust, we also introduce a Step-Uncertainty based Fusion (SUF) module that combines the outputs of the diffusion model at each step during inference. We evaluate our method on three datasets, covering multimodal brain tumors in MRI, liver tumors, and multi-organ CT volumes, and show that Diff-UNet significantly outperforms other state-of-the-art methods. Our experimental results also indicate the universality and effectiveness of the proposed model. The proposed framework has the potential to facilitate accurate diagnosis and treatment of medical conditions by enabling more precise segmentation of anatomical structures. The code for Diff-UNet is available at https://github.com/ge-xing/Diff-UNet
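The abstract does not spell out how the SUF module weights each step, so the following is only a minimal sketch of the general idea of uncertainty-weighted fusion across denoising steps, not the authors' implementation. It assumes binary foreground probabilities per voxel, uses normalized binary entropy as the uncertainty, favors later steps through a sigmoid of the step index, and normalizes by the weight total; the function names `binary_entropy` and `fuse_steps` and the exact weighting formula are hypothetical. Plain Python lists stand in for volumes to keep the example dependency-free.

```python
import math


def binary_entropy(p, eps=1e-8):
    """Entropy in nats of a Bernoulli probability p, used here as uncertainty."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def fuse_steps(step_probs):
    """Fuse per-step foreground probabilities into one probability per voxel.

    step_probs[i][v] is the probability predicted at denoising step i
    (earliest first) for voxel v. Assumed weighting, not taken from the
    source: later steps and lower-entropy predictions receive more weight.
    """
    num_steps = len(step_probs)
    num_voxels = len(step_probs[0])
    fused = []
    for v in range(num_voxels):
        weighted_sum = 0.0
        weight_total = 0.0
        for i, probs in enumerate(step_probs, start=1):
            p = probs[v]
            # Normalize entropy to [0, 1] by dividing by its maximum, ln 2.
            uncertainty = binary_entropy(p) / math.log(2.0)
            # Sigmoid of the relative step index: later steps are trusted more.
            step_trust = 1.0 / (1.0 + math.exp(-i / num_steps))
            weight = math.exp(step_trust * (1.0 - uncertainty))
            weighted_sum += weight * p
            weight_total += weight
        fused.append(weighted_sum / weight_total)
    return fused


# Two steps, two voxels: the first voxel is ambiguous at step 1 only.
fused = fuse_steps([[0.5, 0.9], [0.9, 0.9]])
print(fused)
```

In this toy input the first voxel's fused value lands above the plain average of 0.5 and 0.9, because the confident later step outweighs the ambiguous earlier one, while the second voxel stays at 0.9 since both steps agree.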

H. Fu | Lei Zhu | Guang Yang | Liang Wan | Zhao-Yang Xing
