Refign: Align and Refine for Adaptation of Semantic Segmentation to Adverse Conditions

Due to the scarcity of dense pixel-level semantic annotations for images recorded in adverse visual conditions, there has been a keen interest in unsupervised domain adaptation (UDA) for the semantic segmentation of such images. UDA adapts models trained on normal conditions to the target adverse-condition domains. Meanwhile, multiple datasets with driving scenes provide corresponding images of the same scenes across multiple conditions, which can serve as a form of weak supervision for domain adaptation. We propose Refign, a generic extension to self-training-based UDA methods which leverages these cross-domain correspondences. Refign consists of two steps: (1) aligning the normal-condition image to the corresponding adverse-condition image using an uncertainty-aware dense matching network, and (2) refining the adverse prediction with the normal prediction using an adaptive label correction mechanism. We design custom modules to streamline both steps and set a new state of the art for domain-adaptive semantic segmentation on several adverse-condition benchmarks, including ACDC and Dark Zurich. The approach introduces no extra training parameters and only minimal computational overhead, incurred during training only, and it can be used as a drop-in extension to improve any given self-training-based UDA method. Code is available at https://github.com/brdav/refign.
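
The two steps can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes a dense matching network has already produced a per-pixel flow field and a confidence map in [0, 1], and it stands in a simple confidence-weighted blend for the adaptive label correction mechanism, whose exact form the abstract does not specify. The function names `warp_probs` and `refine`, the flow convention, and the nearest-neighbour sampling are all hypothetical choices made for illustration.

```python
import numpy as np


def warp_probs(probs, flow):
    """Step 1 (align): backward-warp an (H, W, C) probability map.

    Assumed convention: flow[y, x] = (dx, dy) means that pixel (x, y) of the
    adverse image corresponds to pixel (x + dx, y + dy) of the normal image.
    Sampling is nearest-neighbour; pixels that map outside the frame are
    reported as invalid.
    """
    h, w, _ = probs.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.rint(xs + flow[..., 0]).astype(int)
    src_y = np.rint(ys + flow[..., 1]).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    # Clip only so the indexing is safe; invalid pixels are masked out later.
    warped = probs[np.clip(src_y, 0, h - 1), np.clip(src_x, 0, w - 1)]
    return warped, valid


def refine(adverse_probs, normal_probs, flow, confidence):
    """Step 2 (refine): blend the adverse prediction with the warped normal one.

    Illustrative stand-in for adaptive label correction: the warped normal
    prediction is trusted in proportion to the matching confidence, and not at
    all where the warp left the image.
    """
    warped, valid = warp_probs(normal_probs, flow)
    weight = (confidence * valid)[..., None]  # (H, W, 1), broadcast over classes
    return (1.0 - weight) * adverse_probs + weight * warped
```

In a self-training setup, the refined probabilities would take the place of the raw adverse-condition pseudo-labels. Because the blend has no learnable weights and is only needed when pseudo-labels are generated, a scheme of this shape is consistent with the abstract's claim of no extra training parameters and training-time-only overhead.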
