ADeLA: Automatic Dense Labeling with Attention for Viewpoint Adaptation in Semantic Segmentation

We describe an unsupervised domain adaptation method for image content shift caused by viewpoint changes for a semantic segmentation task. Most existing methods perform domain alignment in a shared space and assume that the mapping from the aligned space to the output is transferable. However, the novel content induced by viewpoint changes may nullify such a space for effective alignments, thus resulting in negative adaptation. Our method works without aligning any statistics of the images between the two domains. Instead, it utilizes a view transformation network trained only on color images to hallucinate the semantic images for the target. Despite the lack of supervision, the view transformation network can still generalize to semantic images thanks to the inductive bias introduced by the attention mechanism. Furthermore, to resolve ambiguities in converting the semantic images to semantic labels, we treat the view transformation network as a functional representation of an unknown mapping implied by the color images and propose functional label hallucination to generate pseudo-labels in the target domain. Our method surpasses baselines built on state-of-the-art correspondence estimation and view synthesis methods. Moreover, it outperforms the state-of-the-art unsupervised domain adaptation methods that utilize self-training and adversarial domain alignment. Our code and dataset will be made publicly available.

[1]  Yong Wang,et al.  Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Gabriela Csurka,et al.  Deep Visual Domain Adaptation , 2020, 2020 22nd International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC).

[3]  Jiahua Dong,et al.  CSCL: Critical Semantic-Consistent Learning for Unsupervised Domain Adaptation , 2020, ECCV.

[4]  Hongyang Chao,et al.  Learning Joint Spatial-Temporal Transformations for Video Inpainting , 2020, ECCV.

[5]  Xiaobing Zhang,et al.  Contextual-Relation Consistent Domain Adaptation for Semantic Segmentation , 2020, ECCV.

[6]  Giovanni Maria Farinella,et al.  SceneAdapt: Scene-based domain adaptation for semantic segmentation using adversarial learning , 2020, Pattern Recognit. Lett..

[7]  Noah Snavely,et al.  Single-View View Synthesis With Multiplane Images , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  In So Kweon,et al.  Unsupervised Intra-Domain Adaptation for Semantic Segmentation Through Self-Supervision , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Stefano Soatto,et al.  FDA: Fourier Domain Adaptation for Semantic Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Stefano Soatto,et al.  Phase Consistent Ecological Domain Adaptation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Jia Deng,et al.  RAFT: Recurrent All-Pairs Field Transforms for Optical Flow , 2020, ECCV.

[12]  Yan Xu,et al.  MaskFlownet: Asymmetric Feature Matching With Learnable Occlusion Mask , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Wen-mei W. Hwu,et al.  Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Geoffrey J. Gordon,et al.  Domain Adaptation with Conditional Distribution Matching and Generalized Label Shift , 2020, NeurIPS.

[15]  Hyeran Byun,et al.  Learning Texture Invariant Representation for Domain Adaptation of Semantic Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  R. Szeliski,et al.  SynSin: End-to-End View Synthesis From a Single Image , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Dacheng Tao,et al.  Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation , 2019, NeurIPS.

[18]  Jingang Tan,et al.  SSF-DAN: Separated Semantic Feature Based Domain Adaptation Network for Semantic Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[19]  Alexandru Paul Condurache,et al.  NoVA: Learning to See in Novel Viewpoints and Domains , 2019, 2019 International Conference on 3D Vision (3DV).

[20]  Gordon Wetzstein,et al.  Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations , 2019, NeurIPS.

[21]  Paul Debevec,et al.  DeepView: View Synthesis With Learned Gradient Descent , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Nuno Vasconcelos,et al.  Bidirectional Learning for Domain Adaptation of Semantic Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Jitendra Malik,et al.  Habitat: A Platform for Embodied AI Research , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[24]  Junqing Yu,et al.  Significance-Aware Information Bottleneck for Domain Adaptive Semantic Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[25]  Wei-Lun Chang,et al.  All About Structure: Adapting Structural Information Across Domains for Boosting Semantic Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Yi-Hsuan Tsai,et al.  Domain Adaptation for Structured Output via Discriminative Patch Representations , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[27]  L. Gool,et al.  DLOW: Domain Flow for Adaptation and Generalization , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Jan Kautz,et al.  Extreme View Synthesis , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[29]  Patrick Pérez,et al.  ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Yunchao Wei,et al.  Self-Similarity Grouping: A Simple Unsupervised Cross Domain Adaptation Approach for Person Re-Identification , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[31]  Rogério Schmidt Feris,et al.  Co-regularized Alignment for Unsupervised Domain Adaptation , 2018, NeurIPS.

[32]  Yi Yang,et al.  Taking a Closer Look at Domain Shift: Category-Level Adversaries for Semantics Consistent Domain Adaptation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Kate Saenko,et al.  Syn2Real: A New Benchmark forSynthetic-to-Real Visual Domain Adaptation , 2018, ArXiv.

[34]  John Flynn,et al.  Stereo magnification , 2018, ACM Trans. Graph..

[35]  Slawomir Bak,et al.  Domain Adaptation through Synthesis for Unsupervised Person Re-identification , 2018, ECCV.

[36]  Larry S. Davis,et al.  DCAN: Dual Channel-wise Alignment Networks for Unsupervised Scene Adaptation , 2018, ECCV.

[37]  Stefano Ermon,et al.  A DIRT-T Approach to Unsupervised Domain Adaptation , 2018, ICLR.

[38]  Mei Wang,et al.  Deep Visual Domain Adaptation: A Survey , 2018, Neurocomputing.

[39]  Yi Yang,et al.  Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[40]  Swami Sankaranarayanan,et al.  Learning from Synthetic Data: Addressing Domain Shift for Semantic Segmentation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[41]  Taesung Park,et al.  CyCADA: Cycle-Consistent Adversarial Domain Adaptation , 2017, ICML.

[42]  Kate Saenko,et al.  VisDA: The Visual Domain Adaptation Challenge , 2017, ArXiv.

[43]  Donald A. Adjeroh,et al.  Unified Deep Supervised Domain Adaptation and Generalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[44]  Matthias Nießner,et al.  Matterport3D: Learning from RGB-D Data in Indoor Environments , 2017, 2017 International Conference on 3D Vision (3DV).

[45]  Bolei Zhou,et al.  Scene Parsing through ADE20K Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[47]  Ping Tan,et al.  DualGAN: Unsupervised Dual Learning for Image-to-Image Translation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[48]  Nanning Zheng,et al.  View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition from Skeleton Data , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[49]  Jan Kautz,et al.  Unsupervised Image-to-Image Translation Networks , 2017, NIPS.

[50]  Tatsuya Harada,et al.  Asymmetric Tri-training for Unsupervised Domain Adaptation , 2017, ICML.

[51]  Trevor Darrell,et al.  Adversarial Discriminative Domain Adaptation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Edwin Lughofer,et al.  Central Moment Discrepancy (CMD) for Domain-Invariant Representation Learning , 2017, ICLR.

[53]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[54]  Jitendra Malik,et al.  View Synthesis by Appearance Flow , 2016, ECCV.

[55]  Mubarak Shah,et al.  Learning a Deep Model for Human Action Recognition from Novel Viewpoints , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[56]  Jianxiong Xiao,et al.  SUN RGB-D: A RGB-D scene understanding benchmark suite , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[57]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[58]  Rama Chellappa,et al.  Visual Domain Adaptation: A survey of recent advances , 2015, IEEE Signal Processing Magazine.

[59]  Michael I. Jordan,et al.  Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[60]  Tinne Tuytelaars,et al.  Unsupervised Visual Domain Adaptation Using Subspace Alignment , 2013, 2013 IEEE International Conference on Computer Vision.

[61]  Brian C. Lovell,et al.  Unsupervised Domain Adaptation by Domain Invariant Projection , 2013, 2013 IEEE International Conference on Computer Vision.

[62]  Nathan Silberman,et al.  Indoor scene segmentation using a structured light sensor , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[63]  Bo Geng,et al.  DAML: Domain Adaptation Metric Learning , 2011, IEEE Transactions on Image Processing.

[64]  Yoshua Bengio,et al.  Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach , 2011, ICML.

[65]  Trevor Darrell,et al.  Adapting Visual Category Models to New Domains , 2010, ECCV.

[66]  Gabriela Csurka,et al.  A Comprehensive Survey on Domain Adaptation for Visual Applications , 2017, Domain Adaptation in Computer Vision Applications.

[67]  Philip David,et al.  Domain Adaptation for Semantic Segmentation of Urban Scenes , 2017 .

[68]  Silvio Savarese,et al.  Learning Transferrable Representations for Unsupervised Domain Adaptation , 2016, NIPS.