论文信息 - Handling new target classes in semantic segmentation with domain adaptation

Handling new target classes in semantic segmentation with domain adaptation

Abstract In this work, we define and address a novel domain adaptation (DA) problem in semantic scene segmentation, where the target domain not only exhibits a data distribution shift w.r.t. the source domain, but also includes novel classes that do not exist in the latter. Different to “open-set” (Panareda Busto and Gall, 2017) and “universal domain adaptation” (You et al. 2019), which both regard all objects from new classes as “unknown”, we aim at explicit test-time prediction for these new classes. To reach this goal, we propose a framework that leverages domain adaptation and zero-shot learning techniques to enable “boundless” adaptation in the target domain. It relies on a novel architecture, along with a dedicated learning scheme, to bridge the source-target domain gap while learning how to map new classes’ labels to relevant visual representations. The performance is further improved using self-training on target-domain pseudo-labels. For validation, we consider different domain adaptation set-ups, namely synthetic-2-real, country-2-country and dataset-2-dataset. Our framework outperforms the baselines by significant margins, setting competitive standards on all benchmarks for the new task. Code and models are available at: https://github.com/valeoai/buda .

[1] Frédéric Jurie,et al. Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classiffication , 2016, ECCV.

[2] Christoph H. Lampert,et al. Attribute-Based Classification for Zero-Shot Visual Object Categorization , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Léon Bottou,et al. Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.

[4] Andrew Y. Ng,et al. Zero-Shot Learning Through Cross-Modal Transfer , 2013, NIPS.

[5] Patrick Pérez,et al. ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Bernt Schiele,et al. Evaluation of output embeddings for fine-grained image classification , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7] Juergen Gall,et al. Open Set Domain Adaptation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[8] Luc Van Gool,et al. Domain Adaptive Faster R-CNN for Object Detection in the Wild , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[9] Qingming Huang,et al. Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10] Bernt Schiele,et al. Zero-Shot Learning — The Good, the Bad and the Ugly , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11] Zhongfei Zhang,et al. Transductive Zero-Shot Learning With a Self-Training Dictionary Approach , 2017, IEEE Transactions on Cybernetics.

[12] Jiechao Guan,et al. Domain-Invariant Projection Learning for Zero-Shot Recognition , 2018, NeurIPS.

[13] Geoffrey French,et al. Self-ensembling for visual domain adaptation , 2017, ICLR.

[14] Nuno Vasconcelos,et al. Bidirectional Learning for Domain Adaptation of Semantic Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15] Jianmin Wang,et al. Transductive Zero-Shot Recognition via Shared Model Space Learning , 2016, AAAI.

[16] Philip David,et al. Domain Adaptation for Semantic Segmentation of Urban Scenes , 2017 .

[17] Ming-Hsuan Yang,et al. Learning to Adapt Structured Output Space for Semantic Segmentation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18] Toshihiko Yamasaki,et al. Zero-Shot Semantic Segmentation via Variational Mapping , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).

[19] Jichang Guo,et al. Transductive Zero-Shot Learning With Adaptive Structural Embedding , 2017, IEEE Transactions on Neural Networks and Learning Systems.

[20] Frédéric Jurie,et al. Generating Visual Representations for Zero-Shot Classification , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[21] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.

[22] Michael I. Jordan,et al. Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[23] Trevor Darrell,et al. Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[24] Yang Zou,et al. Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training , 2018, ArXiv.

[25] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.

[26] Venkatesh Saligrama,et al. Zero-Shot Learning via Semantic Similarity Embedding , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[27] Taesung Park,et al. CyCADA: Cycle-Consistent Adversarial Domain Adaptation , 2017, ICML.

[28] Trevor Darrell,et al. FCNs in the Wild: Pixel-level Adversarial and Constraint-based Adaptation , 2016, ArXiv.

[29] Bernt Schiele,et al. Learning Deep Representations of Fine-Grained Visual Descriptions , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30] Tatsuya Harada,et al. Open Set Domain Adaptation by Backpropagation , 2018, ECCV.

[31] Jing Zhang,et al. Importance Weighted Adversarial Nets for Partial Domain Adaptation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[32] Tatsuya Harada,et al. Maximum Classifier Discrepancy for Unsupervised Domain Adaptation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[33] Li Liu,et al. A Joint Generative Model for Zero-Shot Learning , 2018, ECCV Workshops.

[34] Bernt Schiele,et al. Feature Generating Networks for Zero-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[35] Matthieu Cord,et al. Zero-Shot Semantic Segmentation , 2019, NeurIPS.

[36] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37] Bernt Schiele,et al. Transfer Learning in a Transductive Setting , 2013, NIPS.

[38] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[39] C. V. Jawahar,et al. IDD: A Dataset for Exploring Problems of Autonomous Navigation in Unconstrained Environments , 2018, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

[40] Shaogang Gong,et al. Transductive Multi-View Zero-Shot Learning , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41] Philip H. S. Torr,et al. An embarrassingly simple approach to zero-shot learning , 2015, ICML.

[42] Michael I. Jordan,et al. Universal Domain Adaptation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[43] Christoph H. Lampert,et al. Zero-Shot Learning—A Comprehensive Evaluation of the Good, the Bad and the Ugly , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[44] Shaogang Gong,et al. Unsupervised Domain Adaptation for Zero-Shot Learning , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[45] Shaogang Gong,et al. Semantic Autoencoder for Zero-Shot Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46] Jianmin Wang,et al. Partial Transfer Learning with Selective Adversarial Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[47] Jie Li,et al. Boosting Real-Time Driving Scene Parsing With Shared Semantics , 2020, IEEE Robotics and Automation Letters.

[48] Yang Liu,et al. Transductive Unbiased Embedding for Zero-Shot Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[49] Sebastian Ramos,et al. The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[50] Giovanni Maria Farinella,et al. SceneAdapt: Scene-based domain adaptation for semantic segmentation using adversarial learning , 2020, Pattern Recognit. Lett..

[51] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[52] George Papandreou,et al. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation , 2018, ECCV.

[53] Antonio M. López,et al. The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[54] Kate Saenko,et al. Deep CORAL: Correlation Alignment for Deep Domain Adaptation , 2016, ECCV Workshops.

[55] Swami Sankaranarayanan,et al. Learning from Synthetic Data: Addressing Domain Shift for Semantic Segmentation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[56] Chuan Chen,et al. Learning Semantic Representations for Unsupervised Domain Adaptation , 2018, ICML.

[57] Piyush Rai,et al. Generalized Zero-Shot Learning via Synthesized Examples , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[58] Marc'Aurelio Ranzato,et al. DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.

[59] Victor S. Lempitsky,et al. Unsupervised Domain Adaptation by Backpropagation , 2014, ICML.

[60] Luc Van Gool,et al. The Pascal Visual Object Classes Challenge: A Retrospective , 2014, International Journal of Computer Vision.

[61] Bernt Schiele,et al. Semantic Projection Network for Zero- and Few-Label Semantic Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).