论文信息 - MiniNet: An Efficient Semantic Segmentation ConvNet for Real-Time Robotic Applications

MiniNet: An Efficient Semantic Segmentation ConvNet for Real-Time Robotic Applications

Efficient models for semantic segmentation, in terms of memory, speed, and computation, could boost many robotic applications with strong computational and temporal restrictions. This article presents a detailed analysis of different techniques for efficient semantic segmentation. Following this analysis, we have developed a novel architecture, MiniNet-v2, an enhanced version of MiniNet. MiniNet-v2 is built considering the best option depending on CPU or GPU availability. It reaches comparable accuracy to the state-of-the-art models but uses less memory and computational resources. We validate and analyze the details of our architecture through a comprehensive set of experiments on public benchmarks (Cityscapes, Camvid, and COCO-Text datasets), showing its benefits over relevant prior work. Our experiments include a sample application where these models can boost existing robotic applications.

[1] Sujith Ravi,et al. ProjectionNet: Learning Efficient On-Device Deep Networks Using Neural Projections , 2017, ArXiv.

[2] Timo Aila,et al. Pruning Convolutional Neural Networks for Resource Efficient Transfer Learning , 2016, ArXiv.

[3] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[4] George Papandreou,et al. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation , 2018, ECCV.

[5] Yoshua Bengio,et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations , 2015, NIPS.

[6] Xiaogang Wang,et al. Pyramid Scene Parsing Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7] Jiashi Feng,et al. Partial Order Pruning: For Best Speed/Accuracy Trade-Off in Neural Architecture Search , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[8] Eduardo Romera,et al. ERFNet: Efficient Residual Factorized ConvNet for Real-Time Semantic Segmentation , 2018, IEEE Transactions on Intelligent Transportation Systems.

[9] Ran El-Yaniv,et al. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations , 2016, J. Mach. Learn. Res..

[10] Gang Yu,et al. BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation , 2018, ECCV.

[11] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12] Sheng Tang,et al. CGNet: A Light-Weight Context Guided Network for Semantic Segmentation , 2018, IEEE Transactions on Image Processing.

[13] Vladlen Koltun,et al. Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.

[14] Luis Riazuelo,et al. Enhancing V-SLAM Keyframe Selection with an Efficient ConvNet for Semantic Analysis , 2019, 2019 International Conference on Robotics and Automation (ICRA).

[15] Luis Miguel Bergasa,et al. Train Here, Deploy There: Robust Segmentation in Unseen Domains , 2018, 2018 IEEE Intelligent Vehicles Symposium (IV).

[16] Linda G. Shapiro,et al. ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation , 2018, ECCV.

[17] Li Fei-Fei,et al. Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18] Raimondo Schettini,et al. Spatial Sampling Network for Fast Scene Understanding , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[19] Sebastian Ramos,et al. The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20] Xiaojuan Qi,et al. ICNet for Real-Time Semantic Segmentation on High-Resolution Images , 2017, ECCV.

[21] Wei Xiang,et al. ThunderNet: A Turbo Unified Network for Real-Time Semantic Segmentation , 2019, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

[22] Roberto Cipolla,et al. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23] Jiri Matas,et al. COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images , 2016, ArXiv.

[24] Eugenio Culurciello,et al. ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation , 2016, ArXiv.

[25] Roberto Cipolla,et al. Semantic object classes in video: A high-definition ground truth database , 2009, Pattern Recognit. Lett..

[26] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[28] Yoshua Bengio,et al. The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).