Multi-scale Transformer Network with Edge-aware Pre-training for Cross-Modality MR Image Synthesis.

Cross-modality magnetic resonance (MR) image synthesis can be used to generate missing modalities from the available ones. Existing supervised methods typically require a large amount of paired multi-modal data to train an effective synthesis model, yet sufficient paired data are often difficult to obtain. In practice, only a small amount of paired data is usually available alongside a large amount of unpaired data. To take advantage of both, we propose a Multi-scale Transformer Network (MT-Net) with edge-aware pre-training for cross-modality MR image synthesis. Specifically, an Edge-preserving Masked AutoEncoder (Edge-MAE) is first pre-trained in a self-supervised manner to simultaneously perform 1) image imputation for randomly masked patches in each image and 2) whole edge map estimation, so that it learns both contextual and structural information. In addition, a novel patch-wise loss is proposed to enhance the performance of Edge-MAE by treating masked patches differently according to the difficulty of their respective imputations. Building on this pre-training, in the subsequent fine-tuning stage a Dual-scale Selective Fusion (DSF) module is designed within MT-Net to synthesize missing-modality images by integrating multi-scale features extracted from the encoder of the pre-trained Edge-MAE. Furthermore, the pre-trained encoder is also used to extract high-level features from the synthesized image and the corresponding ground-truth image, and these features are required to be similar (consistent) during training. Experimental results show that MT-Net achieves performance comparable to that of competing methods even when using only 70% of the available paired data. Our code will be released at https://github.com/lyhkevin/MT-Net.
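
The pre-training idea described above (hide random patches, then penalize harder-to-impute patches more heavily) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the loss formula, so the softmax weighting over per-patch L1 errors, the `temperature` parameter, the 75% mask ratio, and all function names here are assumptions chosen purely for illustration. The edge map estimation branch is omitted.

```python
import numpy as np


def patchify(image, patch):
    """Split an (H, W) image into (N, patch*patch) flattened, non-overlapping patches."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    grid = image.reshape(h // patch, patch, w // patch, patch)
    return grid.transpose(0, 2, 1, 3).reshape(-1, patch * patch)


def random_mask(num_patches, mask_ratio, rng):
    """Return a boolean mask; True marks a patch hidden from the encoder."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask


def patchwise_loss(pred, target, mask, temperature=1.0):
    """Difficulty-weighted L1 loss over masked patches.

    Assumption: "difficulty" is approximated by each patch's current L1 error,
    and weights are a softmax over those errors. The paper may define it differently.
    """
    per_patch = np.abs(pred - target).mean(axis=1)[mask]
    logits = per_patch / temperature
    weights = np.exp(logits - logits.max())  # numerically stable softmax
    weights /= weights.sum()
    return float((weights * per_patch).sum())


rng = np.random.default_rng(0)
image = rng.random((32, 32))                 # stand-in for one MR slice
patches = patchify(image, patch=8)           # 16 patches of 64 pixels each
mask = random_mask(len(patches), mask_ratio=0.75, rng=rng)
prediction = np.zeros_like(patches)          # stand-in for a decoder output
loss = patchwise_loss(prediction, patches, mask)
```

With this weighting, patches with larger reconstruction error contribute more than they would under a plain average, which is one simple way to treat masked patches differently by difficulty. In a gradient-based framework the weights would typically be computed without backpropagating through them; that detail is likewise an assumption, not something stated in the abstract.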
