Polarization-driven Semantic Segmentation via Efficient Attention-bridged Fusion

Semantic Segmentation (SS) is promising for outdoor scene perception in safety-critical applications such as autonomous vehicles and assisted navigation. However, traditional SS is primarily based on RGB images, which limits its reliability in complex outdoor scenes, where RGB images lack the information dimensions needed to fully perceive unconstrained environments. As a preliminary investigation, we examine SS in an unexpected obstacle detection scenario, which demonstrates the necessity of multimodal fusion. In this work, we therefore present EAFNet, an Efficient Attention-bridged Fusion Network that exploits complementary information from different optical sensors. Specifically, we incorporate polarization sensing to obtain supplementary information, since its optical characteristics support robust representation of diverse materials. Using a single-shot polarization sensor, we build the first RGB-P dataset, which consists of 394 annotated pixel-aligned RGB-Polarization images. A comprehensive set of experiments shows the effectiveness of EAFNet in fusing polarization and RGB information, as well as its flexibility to be adapted to other sensor combinations.
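
To illustrate the kind of supplementary information polarization sensing can provide, the following is a minimal sketch of how linear polarization cues are commonly derived from four polarizer-angle intensity images via the Stokes parameters. The abstract does not state which polarization representation EAFNet consumes, so the function name `linear_polarization_cues`, its inputs (images at 0°, 45°, 90° and 135°), and the choice of degree and angle of linear polarization as outputs are assumptions made purely for illustration.

```python
import numpy as np


def linear_polarization_cues(i0, i45, i90, i135, eps=1e-8):
    """Hypothetical helper: degree and angle of linear polarization.

    Inputs are same-shaped intensity images captured behind linear
    polarizers at 0, 45, 90 and 135 degrees (an assumed sensor layout).
    """
    i0, i45, i90, i135 = (
        np.asarray(x, dtype=np.float64) for x in (i0, i45, i90, i135)
    )
    # Stokes parameters for linear polarization.
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity (average of two estimates)
    s1 = i0 - i90                       # horizontal vs. vertical component
    s2 = i45 - i135                     # diagonal components
    # Degree of linear polarization, clipped to [0, 1] against noise.
    dolp = np.clip(np.sqrt(s1 ** 2 + s2 ** 2) / (s0 + eps), 0.0, 1.0)
    # Angle of linear polarization in radians, in [-pi/2, pi/2].
    aolp = 0.5 * np.arctan2(s2, s1)
    return dolp, aolp


if __name__ == "__main__":
    # Fully horizontally polarized light of unit intensity (Malus's law).
    dolp, aolp = linear_polarization_cues(
        np.array([[1.0]]), np.array([[0.5]]), np.array([[0.0]]), np.array([[0.5]])
    )
    print(dolp, aolp)
```

In such a setup, maps like these could be stacked as extra input channels alongside RGB; whether EAFNet does so is not specified here.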
