DensePASS: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation with Attention-Augmented Context Exchange

Intelligent vehicles clearly benefit from the expanded Field of View (FoV) of 360° sensors, but the vast majority of available semantic segmentation training images are captured with pinhole cameras. In this work, we look at this problem through the lens of domain adaptation and bring panoramic semantic segmentation to a setting where the labelled training data originates from a different distribution, namely conventional pinhole camera images. First, we formalize the task of unsupervised domain adaptation for panoramic semantic segmentation, where a network trained on labelled examples from the source domain of pinhole camera data is deployed in a different target domain of panoramic images, for which no labels are available. To validate this idea, we collect and publicly release DensePASS, a novel densely annotated dataset for panoramic segmentation under cross-domain conditions, specifically built to study the Pinhole→Panoramic transfer and accompanied by pinhole camera training examples obtained from Cityscapes. DensePASS covers both labelled and unlabelled 360° images, with the labelled data comprising 19 classes which explicitly fit the categories available in the source domain (i.e. pinhole) data. To meet the challenge of domain shift, we leverage the current progress of attention-based mechanisms and build a generic framework for cross-domain panoramic semantic segmentation based on different variants of attention-augmented domain adaptation modules. Our framework facilitates information exchange at local and global levels when learning the domain correspondences and improves the domain adaptation performance of two standard segmentation networks by 6.05% and 11.26% in Mean IoU.
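
For readers unfamiliar with the reported metric, the following is a minimal sketch of how Mean IoU over the 19 classes might be computed from flattened per-pixel predictions and labels. It is not the authors' evaluation code: the function names, the ignore label of 255 (a Cityscapes-style convention), and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration.

```python
from typing import List, Sequence

NUM_CLASSES = 19    # number of labelled classes stated in the abstract
IGNORE_LABEL = 255  # assumption: Cityscapes-style id for unlabelled pixels


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int = NUM_CLASSES,
                     ignore_label: int = IGNORE_LABEL) -> List[List[int]]:
    """Count pixels per (ground-truth class, predicted class) pair."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # unlabelled pixels do not contribute to the score
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class c pixels predicted as other classes
        fp = sum(matrix[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:  # assumption: skip classes absent from both pred and target
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy usage with four pixels; the last one is unlabelled and ignored.
pred = [0, 0, 1, 1]
target = [0, 1, 1, IGNORE_LABEL]
score = mean_iou(confusion_matrix(pred, target))
```

In practice the confusion matrix would be accumulated over all labelled panoramic images before the per-class IoU values are averaged.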
