Paper Information - Towards Learning a Self-inverse Network for Bidirectional Image-to-image Translation

A one-to-one mapping is necessary for many bidirectional image-to-image translation applications, such as MRI image synthesis, because MRI images are unique to the patient. State-of-the-art approaches for image synthesis from domain X to domain Y learn a convolutional neural network that maps between the domains. A different network is typically trained to map in the opposite direction, from Y to X. In this paper, we explore the possibility of using only one network for bidirectional image synthesis. In other words, such a network implements a self-inverse function. A self-inverse network offers several distinct advantages: only one network instead of two, better generalization, and a more restricted parameter space. Most importantly, a self-inverse function guarantees a one-to-one mapping, a property that cannot be guaranteed by earlier approaches that are not self-inverse. Experiments on three datasets show that, compared with baseline approaches that use two separate models for image synthesis in the two directions, our self-inverse network achieves better synthesis results in terms of standard metrics. Finally, our sensitivity analysis confirms the feasibility of learning a self-inverse function for bidirectional image translation.
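The one-to-one guarantee follows directly from the self-inverse property: if f(f(x)) = x for every x, then f(a) = f(b) implies a = f(f(a)) = f(f(b)) = b. The snippet below is a minimal sketch of that property on a toy scalar function, not the paper's network; the names `reflect`, `is_self_inverse`, and `is_one_to_one_on` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence


def reflect(x: float, c: float = 1.0) -> float:
    """Toy self-inverse function (assumption, not from the paper): c - (c - x) == x."""
    return c - x


def is_self_inverse(f: Callable[[float], float],
                    samples: Sequence[float],
                    tol: float = 1e-9) -> bool:
    """Check f(f(x)) == x on the given samples, up to a tolerance."""
    return all(abs(f(f(x)) - x) <= tol for x in samples)


def is_one_to_one_on(f: Callable[[float], float],
                     samples: Sequence[float]) -> bool:
    """Check that distinct sample inputs map to distinct outputs."""
    outputs = [f(x) for x in samples]
    return len(set(outputs)) == len(set(samples))


samples = [0.0, 0.25, 0.5, 0.75, 1.0]
print(is_self_inverse(reflect, samples))    # the reflection undoes itself
print(is_one_to_one_on(reflect, samples))   # so no two inputs collide
print(is_self_inverse(lambda x: x * x, samples))  # squaring is not self-inverse
```

In the image-translation setting, the same check would apply a single network twice (X to Y, then Y back to X) and compare the result with the original input.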
