Robust semantic segmentation based on RGB-thermal in variable lighting scenes

Abstract: Semantic segmentation is an indispensable part of the perception task in the Intelligent Vehicle-Infrastructure System (IVIS), and it has made considerable progress with the development of convolutional neural networks. However, current mainstream semantic segmentation networks are designed for 3-channel RGB images captured by visible-light cameras, and their accuracy and robustness remain insufficient in variable lighting environments. This paper therefore proposes a novel network architecture that fuses RGB and thermal data to improve the accuracy and robustness of semantic segmentation under variable lighting. The experimental results verify that introducing thermal data significantly improves both the network's adaptability to variable lighting environments and its segmentation accuracy. The results also demonstrate that the network is accurate and robust under variable lighting, and that its overall performance exceeds that of state-of-the-art networks.
