Improve SegNet with feature pyramid for road scene parsing

Road scene parsing is a common task in semantic segmentation. Road scene images contain complex scene context, and targets of the same category vary greatly in scale. To address these problems, we propose a semantic segmentation model combined with edge detection. We extend an encoder-decoder segmentation network with an edge feature pyramid module, yielding the Edge Feature Pyramid Network (EFPNet). This module uses edge detection operators to extract boundary information and then combines multiscale features to improve the recognition of small targets. EFPNet compensates for the shortcomings of convolutional neural network features and helps produce smooth segmentation. After extracting features from the encoder and decoder, EFPNet uses Euclidean distance to measure the similarity between the encoder and decoder representations, which improves the decoder’s ability to restore information from the encoder. We evaluated the proposed method on the Cityscapes dataset. The experiments show that accuracy improves by 7.5% and 6.2% over the popular SegNet and ENet, respectively. An ablation experiment validates the effectiveness of our method.
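The two ingredients named in the abstract, edge maps computed at several scales and a Euclidean distance between feature maps, can be illustrated with a minimal sketch. This is not the EFPNet implementation: the abstract does not say which edge operator is used, so the Sobel operator here is an assumption, and the function names (`sobel_magnitude`, `downsample2`, `edge_pyramid`, `euclidean_distance`) are hypothetical. Plain Python lists stand in for tensors so the example runs without a deep learning framework.

```python
import math

# Sobel kernels. Assumption: the abstract only says "edge detection operators".
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Gradient magnitude of a 2D grayscale image (list of rows).

    Only the valid region is returned, so the output is 2 smaller
    than the input in each dimension.
    """
    h, w = len(img), len(img[0])
    out = []
    for i in range(1, h - 1):
        row = []
        for j in range(1, w - 1):
            gx = sum(SOBEL_X[a][b] * img[i + a - 1][j + b - 1]
                     for a in range(3) for b in range(3))
            gy = sum(SOBEL_Y[a][b] * img[i + a - 1][j + b - 1]
                     for a in range(3) for b in range(3))
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out


def downsample2(img):
    """Halve the resolution with 2x2 average pooling (odd remainder dropped)."""
    return [[(img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 4.0
             for j in range(0, len(img[0]) - 1, 2)]
            for i in range(0, len(img) - 1, 2)]


def edge_pyramid(img, levels=2):
    """Edge maps at successively halved resolutions, finest first."""
    pyramid = []
    for _ in range(levels):
        pyramid.append(sobel_magnitude(img))
        img = downsample2(img)
    return pyramid


def euclidean_distance(feat_a, feat_b):
    """L2 distance between two equally shaped 2D feature maps."""
    return math.sqrt(sum((a - b) ** 2
                         for row_a, row_b in zip(feat_a, feat_b)
                         for a, b in zip(row_a, row_b)))


# Usage: an 8x8 image with a vertical step edge down the middle.
image = [[0] * 4 + [1] * 4 for _ in range(8)]
pyramid = edge_pyramid(image, levels=2)
distance = euclidean_distance([[0, 0], [0, 0]], [[3, 4], [0, 0]])
```

In a trained network the distance would be computed between encoder and decoder feature tensors and used as a similarity term; how EFPNet weights or applies that term is not stated in the abstract.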
