Self-Guided Adaptation: Progressive Representation Alignment for Domain Adaptive Object Detection

Unsupervised domain adaptation (UDA) has achieved unprecedented success in improving the cross-domain robustness of object detection models. However, existing UDA methods largely ignore the instantaneous data distribution during model learning, which could deteriorate the feature representation given large domain shift. In this work, we propose a Self-Guided Adaptation (SGA) model, target at aligning feature representation and transferring object detection models across domains while considering the instantaneous alignment difficulty. The core of SGA is to calculate "hardness" factors for sample pairs indicating domain distance in a kernel space. With the hardness factor, the proposed SGA adaptively indicates the importance of samples and assigns them different constrains. Indicated by hardness factors, Self-Guided Progressive Sampling (SPS) is implemented in an "easy-to-hard" way during model adaptation. Using multi-stage convolutional features, SGA is further aggregated to fully align hierarchical representations of detection models. Extensive experiments on commonly used benchmarks show that SGA improves the state-of-the-art methods with significant margins, while demonstrating the effectiveness on large domain shift.

[1]  Bernhard Schölkopf,et al.  A Kernel Two-Sample Test , 2012, J. Mach. Learn. Res..

[2]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[3]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Michael I. Jordan,et al.  Unsupervised Domain Adaptation with Residual Transfer Networks , 2016, NIPS.

[5]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[6]  Chong-Wah Ngo,et al.  Exploring Object Relation in Mean Teacher for Cross-Domain Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Hailin Jin,et al.  BAM! The Behance Artistic Media Dataset for Recognition Beyond Photography , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[8]  Taesung Park,et al.  CyCADA: Cycle-Consistent Adversarial Domain Adaptation , 2017, ICML.

[9]  Juergen Gall,et al.  Open Set Domain Adaptation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[10]  Junzhou Huang,et al.  Progressive Feature Alignment for Unsupervised Domain Adaptation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Tiejun Huang,et al.  Cross-Domain Adversarial Feature Learning for Sketch Re-identification , 2018, ACM Multimedia.

[12]  Le Song,et al.  A unified kernel framework for nonparametric inference in graphical models ] Kernel Embeddings of Conditional Distributions , 2013 .

[13]  Michael I. Jordan,et al.  Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[14]  Hans-Peter Kriegel,et al.  Integrating structured biological data by Kernel Maximum Mean Discrepancy , 2006, ISMB.

[15]  Sumit Chopra,et al.  DLID: Deep Learning for Domain Adaptation by Interpolating between Domains , 2013 .

[16]  François Laviolette,et al.  Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res..

[17]  Luc Van Gool,et al.  Domain Adaptive Faster R-CNN for Object Detection in the Wild , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18]  Changick Kim,et al.  Self-Training and Adversarial Background Regularization for Unsupervised Domain Adaptive One-Stage Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[19]  Nuno Vasconcelos,et al.  Towards Universal Object Detection by Domain Attention , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[22]  Kiyoharu Aizawa,et al.  Cross-Domain Weakly-Supervised Object Detection Through Progressive Domain Adaptation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[23]  Andreas Geiger,et al.  Vision meets robotics: The KITTI dataset , 2013, Int. J. Robotics Res..

[24]  Maneesh Singh,et al.  Progressive Domain Adaptation for Object Detection , 2019, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).

[25]  Victor S. Lempitsky,et al.  Unsupervised Domain Adaptation by Backpropagation , 2014, ICML.

[26]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[27]  Kate Saenko,et al.  Strong-Weak Distribution Alignment for Adaptive Object Detection , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Luc Van Gool,et al.  Semantic Foggy Scene Understanding with Synthetic Data , 2017, International Journal of Computer Vision.

[29]  Kaiming He,et al.  Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[30]  Alexei A. Efros,et al.  Unbiased look at dataset bias , 2011, CVPR 2011.

[31]  Xinge Zhu,et al.  Adapting Object Detectors via Selective Cross-Domain Alignment , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Tatsuya Harada,et al.  Maximum Classifier Discrepancy for Unsupervised Domain Adaptation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34]  Donald A. Adjeroh,et al.  Unified Deep Supervised Domain Adaptation and Generalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[35]  Zhiming Luo,et al.  Invariance Matters: Exemplar Memory for Domain Adaptive Person Re-Identification , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Ming-Yu Liu,et al.  Coupled Generative Adversarial Networks , 2016, NIPS.

[37]  Wen Li,et al.  Domain Generalization and Adaptation Using Low Rank Exemplar SVMs , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[38]  Changick Kim,et al.  Diversify and Match: A Domain Adaptive Representation Learning Paradigm for Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Shiguang Shan,et al.  Self-Paced Curriculum Learning , 2015, AAAI.

[40]  Yi Yang,et al.  Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[41]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[42]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[43]  Ivor W. Tsang,et al.  Domain Adaptation via Transfer Component Analysis , 2009, IEEE Transactions on Neural Networks.

[44]  François Laviolette,et al.  Domain-Adversarial Neural Networks , 2014, ArXiv.

[45]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[46]  Trevor Darrell,et al.  Simultaneous Deep Transfer Across Domains and Tasks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[47]  Kiyoung Moon,et al.  UA-DETRAC 2018: Report of AVSS2018 & IWT4S Challenge on Advanced Traffic Monitoring , 2018, 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[48]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).