FDDWNet: A Lightweight Convolutional Neural Network for Real-Time Semantic Segmentation

This paper introduces a lightweight convolutional neural network, called FDDWNet, for real-time accurate semantic segmentation. In contrast to recent advances of lightweight networks that prefer to utilize shallow structure, FDDWNet makes an effort to design more deeper network architecture, while maintains faster inference speed and higher segmentation accuracy. Our network uses factorized dilated depth-wise separable convolutions (FDDWC) to learn feature representations from different scale receptive fields with fewer model parameters. Additionally, FDDWNet has multiple branches of skipped connections to gather context cues from intermediate convolution layers. The experiments show that FDDWNet only has 0.8M model size, while achieves 60 FPS running speed on a single RTX 2080Ti GPU with a 1024 × 512 input image. The comprehensive experiments demonstrate that our model achieves state-of-the-art results in terms of available speed and accuracy trade-off on CityScapes and CamVid datasets.

[1]  Chong Li,et al.  Constrained Optimization Based Low-Rank Approximation of Deep Neural Networks , 2018, ECCV.

[2]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Sheng Tang,et al.  CGNet: A Light-Weight Context Guided Network for Semantic Segmentation , 2018, IEEE Transactions on Image Processing.

[5]  Eduardo Romera,et al.  ERFNet: Efficient Residual Factorized ConvNet for Real-Time Semantic Segmentation , 2018, IEEE Transactions on Intelligent Transportation Systems.

[6]  Xiangyu Zhang,et al.  ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design , 2018, ECCV.

[7]  Roberto Cipolla,et al.  Fast-SCNN: Fast Semantic Segmentation Network , 2019, BMVC.

[8]  Xiaojuan Qi,et al.  ICNet for Real-Time Semantic Segmentation on High-Resolution Images , 2017, ECCV.

[9]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Yiran Chen,et al.  Learning Structured Sparsity in Deep Neural Networks , 2016, NIPS.

[11]  Gen Li,et al.  DABNet: Depth-wise Asymmetric Bottleneck for Real-time Semantic Segmentation , 2019, BMVC.

[12]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[13]  Xiangyu Zhang,et al.  ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[14]  Wenfu Wang,et al.  DSNet for Real-Time Driving Scene Semantic Segmentation , 2018, ArXiv.

[15]  Ian D. Reid,et al.  RefineNet: Multi-path Refinement Networks for High-Resolution Semantic Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Roberto Cipolla,et al.  Segmentation and Recognition Using Structure from Motion Point Clouds , 2008, ECCV.

[17]  Jun Fu,et al.  Dual Attention Network for Scene Segmentation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Ran El-Yaniv,et al.  Binarized Neural Networks , 2016, NIPS.

[19]  Bo Chen,et al.  MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[20]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Mark Sandler,et al.  MobileNetV2: Inverted Residuals and Linear Bottlenecks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Xiaogang Wang,et al.  Pyramid Scene Parsing Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Linda G. Shapiro,et al.  ESPNetv2: A Light-Weight, Power Efficient, and General Purpose Convolutional Neural Network , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Ali Farhadi,et al.  XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016, ECCV.

[26]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[27]  Eugenio Culurciello,et al.  ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation , 2016, ArXiv.

[28]  Léon Bottou,et al.  Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.