J-Net: Asymmetric Encoder-Decoder for Medical Semantic Segmentation

With the development of deep learning, semantic segmentation has seen major breakthroughs. However, generating a fine mask for medical images remains difficult because such images have low contrast, high resolution, and insufficient semantic information. Most existing approaches use pooling layers to reduce the resolution of feature maps, which makes it hard for them to account for features of the whole image and leads to information loss and performance degradation. In this paper, a multiscale asymmetric encoder-decoder semantic segmentation network is proposed. The network consists of two parts, an encoder and a decoder, which perform feature extraction and image restoration on the input, respectively. The encoder obtains multiscale feature information by connecting multiple atrous spatial pyramid pooling (ASPP) modules to form a feature pyramid. Meanwhile, each upsampling layer in the decoder is connected to the feature map generated by the corresponding ASPP module. Finally, the classification of each pixel is obtained through the sigmoid function. The proposed method is evaluated on publicly available datasets. The experimental results show that it takes full advantage of multiscale feature information and achieves superior performance at a lower inference computational cost.
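The core idea behind an ASPP module, parallel atrous (dilated) convolutions that widen the receptive field without reducing resolution, followed by a per-pixel sigmoid, can be illustrated with a minimal 1D sketch. This is not the network described above: the function names, the dilation rates, the kernel, and the equal-weight averaging used to fuse the branches are all assumptions chosen for illustration, and a real implementation would operate on 2D feature maps with learned weights.

```python
import math


def atrous_conv1d(signal, kernel, rate):
    """Same-length 1D atrous (dilated) convolution with zero padding.

    Hypothetical helper: kernel taps are spaced `rate` samples apart, so the
    output keeps the input length while the receptive field grows with `rate`.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def receptive_field(kernel_size, rate):
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


def aspp1d(signal, kernel, rates):
    """Run one atrous branch per rate and fuse them by plain averaging.

    Assumption: averaging stands in for the learned fusion step that an
    actual ASPP module would apply to the branch outputs.
    """
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [sum(vals) / len(rates) for vals in zip(*branches)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def to_mask(logits, threshold=0.5):
    """Turn per-position logits into a binary mask via the sigmoid."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5]
    kernel = [1, 1, 1]
    fused = aspp1d(signal, kernel, rates=(1, 2, 4))
    print(len(fused) == len(signal))  # resolution is preserved
    print([receptive_field(len(kernel), r) for r in (1, 2, 4)])
    print(to_mask([-2.0, 0.0, 3.0]))
```

Unlike a pooling layer, none of the branches shortens the signal. Each output position keeps its own value while drawing on context from a span that grows with the dilation rate, which is the property the multiscale pyramid relies on.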
