Paper Information - An Efficient Semantic Segmentation Method using Pyramid ShuffleNet V2 with Vortex Pooling

Efficient and accurate semantic segmentation is particularly important for applications such as autonomous driving, which require both real-time inference speed and high performance. Many works sacrifice spatial resolution to achieve real-time inference speed, which leads to poor performance. As a result, real-time segmentation on embedded devices is still an open problem. In this paper, we focus on building a network with the best possible performance that still achieves real-time inference speed. We first use a pyramid of kernel sizes for the depthwise convolution (DWConvolution) in ShuffleNet V2, instead of only a 3×3 kernel, to capture more spatial information. Meanwhile, an efficient Vortex Pooling module is employed to aggregate contextual information and generate high-resolution features. Compared with other state-of-the-art real-time semantic segmentation networks, the proposed network achieves similar inference speed and better performance on an embedded device. Specifically, we achieve a state-of-the-art 73.46% mean IoU on the Cityscapes test set and, for a 768×1024 input, a speed of 46.1 frames per second on an NVIDIA Jetson AGX Xavier embedded development board.
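For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how mean IoU is commonly computed from a pixel-level confusion matrix. It is not the paper's evaluation code: the function names, the toy labels, and the choice of 255 as an ignored "void" label are assumptions made for illustration only.

```python
def confusion_matrix(y_true, y_pred, num_classes, ignore_label=255):
    """Count pixels: rows are ground-truth classes, columns are predictions.

    ignore_label=255 is an assumed "void" label; such pixels are skipped.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_label:
            continue
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp    # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:                                # skip classes absent from both maps
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes, label maps flattened to lists.
y_true = [0, 0, 1, 1, 2, 255]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
miou = mean_iou(cm)
print(f"mean IoU = {miou:.4f}")
```

In this toy case the per-class IoUs are 1/2, 2/3 and 1, so the mean is 13/18. Benchmark results are usually computed by accumulating one confusion matrix over the whole evaluation set rather than averaging per-image scores.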

Lin Li | Jingling Yuan | Jiansheng Dong | Xian Zhong | Weiru Liu
