Sequential Instance Refinement for Cross-Domain Object Detection in Images

Cross-domain object detection in images, which aims to adapt a detection model learned from existing labeled images (the source domain) to newly collected unlabeled images (the target domain), has attracted increasing attention in the past few years. Existing methods usually address the problem through direct feature alignment between the source and target domains at the image level, the instance level (i.e., region proposals), or both. However, we have observed that directly aligning the features of all object instances from the two domains often results in negative transfer, owing to (1) outlier target instances, which contain confusing objects that do not belong to any source-domain category and are therefore hard for detectors to capture, and (2) low-relevance source instances, which differ considerably in their statistics from target instances even though the objects they contain belong to the same category. With this in mind, we propose a reinforcement-learning-based method, called sequential instance refinement, in which two agents learn to progressively refine the source and target instances by taking sequential actions that remove outlier target instances and low-relevance source instances step by step. Extensive experiments on several benchmark datasets demonstrate that our method outperforms existing state-of-the-art baselines for cross-domain object detection.
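
To make the idea of step-by-step instance removal concrete, the following is a minimal toy sketch and not the method proposed in this paper. It assumes that each instance is a small feature vector, that domain discrepancy can be approximated by the squared distance between the feature means of the two sets, and that a greedy rule can stand in for the two learned agents. All names (`discrepancy`, `refine`, `min_keep`, `min_gain`) are hypothetical and chosen only for illustration.

```python
def mean_vec(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def discrepancy(source, target):
    """Squared distance between feature means (an assumed, crude proxy)."""
    ms, mt = mean_vec(source), mean_vec(target)
    return sum((a - b) ** 2 for a, b in zip(ms, mt))


def refine(source, target, max_steps=10, min_keep=2, min_gain=0.05):
    """Greedy stand-in for the learned agents (assumption, not the paper's RL).

    At each step, remove the single source or target instance whose removal
    lowers the discrepancy the most; stop once no removal improves the
    discrepancy by more than min_gain.
    """
    source, target = list(source), list(target)  # work on copies
    for _ in range(max_steps):
        best_d = discrepancy(source, target) - min_gain
        best_pool, best_idx = None, None
        for pool in (source, target):
            if len(pool) <= min_keep:
                continue  # never shrink a set below min_keep instances
            for i in range(len(pool)):
                rest = pool[:i] + pool[i + 1:]
                if pool is source:
                    d = discrepancy(rest, target)
                else:
                    d = discrepancy(source, rest)
                if d < best_d:
                    best_d, best_pool, best_idx = d, pool, i
        if best_pool is None:
            break  # no sufficiently helpful removal remains
        best_pool.pop(best_idx)
    return source, target


# Toy data: one low-relevance source instance and one outlier target instance.
src = [[0.0, 0.0], [0.2, 0.1], [0.1, -0.1], [10.0, 10.0]]
tgt = [[0.1, 0.0], [0.0, 0.2], [-0.1, 0.1], [-8.0, -8.0]]
refined_src, refined_tgt = refine(src, tgt)
print(len(refined_src), len(refined_tgt))
print(discrepancy(src, tgt), discrepancy(refined_src, refined_tgt))
```

In this sketch the removal decisions are driven directly by the discrepancy measure. In the method described above, by contrast, the removal actions are taken by two agents trained with reinforcement learning; the greedy rule here only illustrates the sequential structure of the refinement.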
