Few-Shot Object Detection via Classification Refinement and Distractor Retreatment

We tackle Few-Shot Object Detection (FSOD), in which some categories have only a few annotated examples available during training. We investigate the failure modes of Faster R-CNN in FSOD and find that the performance degradation is mainly due to poor classification (false positives) caused by category confusion, which motivates us to address FSOD from the perspective of classification refinement. Specifically, we address this limitation through both architectural enhancement and hard-example mining. We introduce a few-shot classification refinement mechanism in which a decoupled Few-Shot Classification Network (FSCN) is employed to improve the final classification of a base detector. Moreover, we examine a commonly overlooked but destructive issue in FSOD: the presence of distractor samples caused by incomplete annotations, where images from the base set may contain novel-class objects that remain unlabelled. We develop retreatment solutions to eliminate the resulting false positives. For FSCN training, handling distractors is formulated as a semi-supervised problem, and a distractor utilization loss is proposed to make proper use of them for boosting the data-scarce classes; for the base detector, a confidence-guided dataset pruning (CGDP) technique is developed to facilitate few-shot adaptation. Experiments demonstrate that the proposed framework achieves state-of-the-art FSOD performance on public datasets such as Pascal VOC and MS-COCO.
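The abstract names confidence-guided dataset pruning but does not spell out the procedure. The sketch below shows one plausible reading, under stated assumptions: a detector is run over the base-set images, and any image with a confident novel-class prediction is treated as containing an unlabelled distractor and dropped. The `Detection` type, the `prune_base_images` function, and the 0.5 threshold are hypothetical names and values chosen for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str    # predicted class name
    score: float  # detector confidence in [0, 1]


def prune_base_images(predictions, novel_classes, threshold=0.5):
    """Return the ids of base-set images kept after pruning.

    predictions: dict mapping image id -> list of Detection
    novel_classes: set of class names treated as novel (data-scarce)
    threshold: assumed confidence cut-off for flagging a distractor
    """
    kept = []
    for image_id, detections in predictions.items():
        # An image is flagged if the detector confidently predicts a
        # novel-class object in it (a likely unlabelled distractor).
        has_distractor = any(
            d.label in novel_classes and d.score >= threshold
            for d in detections
        )
        if not has_distractor:
            kept.append(image_id)
    return kept


# Toy predictions on three base-set images; "bus" plays the novel class.
predictions = {
    "img_001": [Detection("person", 0.98), Detection("bus", 0.91)],
    "img_002": [Detection("person", 0.95)],
    "img_003": [Detection("bus", 0.20)],
}
kept = prune_base_images(predictions, novel_classes={"bus"}, threshold=0.5)
print(kept)
```

In this toy run only `img_001` is pruned, because its novel-class prediction exceeds the threshold; the low-confidence prediction in `img_003` is ignored. Whether flagged images are discarded entirely or merely have the offending regions masked is a design choice the abstract does not settle.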
