Improving Anomaly Segmentation with Multi-Granularity Cross-Domain Alignment

Anomaly segmentation identifies anomalous objects within images, which supports the detection of road anomalies in autonomous driving. Although existing methods trained on synthetic data have shown impressive results, they often neglect the domain discrepancy between synthetic training data and real test data. To address this issue, we propose the Multi-Granularity Cross-Domain Alignment (MGCDA) framework for anomaly segmentation in complex driving environments. It combines a new Multi-source Domain Adversarial Training (MDAT) module with a novel Cross-domain Anomaly-aware Contrastive Learning (CACL) method to improve the generalization of the model, integrating multi-domain data at both the scene and sample levels. The MDAT module combines a multi-source domain adversarial loss with a dynamic label smoothing strategy and, through adversarial training across multiple stages, encourages the model to learn domain-invariant features at the scene level. CACL aligns sample-level representations with a contrastive loss on cross-domain data, using an anomaly-aware sampling strategy to efficiently select hard samples and anchors. The proposed framework adds no parameters at inference time and is compatible with other anomaly segmentation networks. Experiments conducted on the Fishyscapes and RoadAnomaly datasets demonstrate that the proposed framework achieves state-of-the-art performance.
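The abstract does not give the loss definitions, so the following is only a minimal sketch of the two kinds of objective it names, under stated assumptions. `smoothed_domain_loss` is a hypothetical stand-in for a multi-source domain classification loss with label smoothing; it uses a fixed smoothing factor `eps`, whereas the paper's strategy is dynamic and its schedule is not described here. `contrastive_loss` is a generic InfoNCE-style loss with an assumed temperature `tau`; the anomaly-aware sampling of anchors and hard samples is not modelled. All function and parameter names are illustrative and do not come from the paper.

```python
import math


def smoothed_domain_loss(logits, domain, eps):
    """Cross-entropy between domain-discriminator logits and a smoothed
    one-hot target (assumed form; eps is fixed here, not dynamic)."""
    k = len(logits)
    # Numerically stable log-softmax over the k source domains.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    log_probs = [v - log_z for v in logits]
    # Smoothed target: 1 - eps on the true domain, eps / (k - 1) elsewhere.
    target = [(1.0 - eps) if i == domain else eps / (k - 1) for i in range(k)]
    return -sum(t * lp for t, lp in zip(target, log_probs))


def contrastive_loss(anchor, positives, negatives, tau=0.1):
    """InfoNCE-style loss averaged over the positives (assumed form).
    Features are L2-normalised, so similarity is cosine / tau."""

    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    def sim(a, b):
        return sum(x * y for x, y in zip(a, b)) / tau

    a = unit(anchor)
    neg_sum = sum(math.exp(sim(a, unit(n))) for n in negatives)
    total = 0.0
    for p in positives:
        pos = math.exp(sim(a, unit(p)))
        # Pull the anchor toward this positive, push it away from negatives.
        total += -math.log(pos / (pos + neg_sum))
    return total / len(positives)


# Toy usage: three source domains, 2-D sample features.
scene_loss = smoothed_domain_loss([2.0, 0.5, -1.0], domain=0, eps=0.1)
sample_loss = contrastive_loss([1.0, 0.0], [[0.9, 0.1]], [[-1.0, 0.2]])
```

In an adversarial setup the domain loss would typically be minimised by the discriminator and maximised by the feature extractor (for example through gradient reversal), which is what pushes features toward domain invariance; that training loop is omitted here.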
