LMFFNet: A Well-Balanced Lightweight Network for Fast and Accurate Semantic Segmentation

Real-time semantic segmentation is widely used in autonomous driving and robotics. Most previous networks achieve high accuracy by relying on complex models with heavy computational cost, while existing lightweight networks generally reduce the number of parameters at the expense of segmentation accuracy. Balancing parameter count against accuracy is therefore critical for real-time semantic segmentation. In this article, we propose a lightweight multiscale-feature-fusion network (LMFFNet) composed mainly of three types of components: the split-extract-merge bottleneck (SEM-B) block, the feature fusion module (FFM), and the multiscale attention decoder (MAD). The SEM-B block extracts sufficient features with fewer parameters, the FFMs fuse multiscale semantic features to improve segmentation accuracy, and the MAD recovers the details of the input images through an attention mechanism. Without pretraining, LMFFNet-3-8 achieves 75.1% mean intersection over union (mIoU) with 1.4 M parameters at 118.9 frames/s on an RTX 3090 GPU. Further experiments are conducted at various resolutions on three other datasets: CamVid, KITTI, and WildDash2. The experiments verify that the proposed LMFFNet model strikes a good tradeoff between segmentation accuracy and inference speed for real-time tasks. The source code is publicly available at https://github.com/Greak-1124/LMFFNet.
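For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of how mean intersection over union is commonly computed from predicted and ground-truth class labels. It is not taken from the LMFFNet repository: the function name `mean_iou`, the flattened-sequence input, and the `ignore_index` default of 255 are assumptions made for illustration, and the authors' evaluation script may differ in how it handles ignored pixels or absent classes.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int,
             ignore_index: int = 255) -> float:
    """Mean IoU over flattened label sequences (illustrative helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where pred == target == c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labelled as class c

    for p, t in zip(pred, target):
        if t == ignore_index:  # assumed "void" label, excluded from scoring
            continue
        target_count[t] += 1
        pred_count[p] += 1
        if p == t:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For each class, the IoU is the number of pixels correctly assigned to that class divided by the number of pixels that either the prediction or the ground truth assigns to it; the mIoU is the average of these per-class values.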
