R-SSL: Region based Semi-Supervised Learning for Sparsely Annotated Object Detection

Training with sparse annotations is known to reduce the performance of object detectors. Previous methods have focused on proxies for the missing ground-truth annotations, either in the form of pseudo-labels or by reweighting gradients for unlabeled boxes. In contrast, we formulate sparsely annotated object detection as a semi-supervised learning problem at the region level. We then propose an end-to-end system that learns to separate proposals into labeled and unlabeled regions. The two groups of regions are processed differently, as in semi-supervised learning, which reduces the negative effect of missing annotations. This approach has several advantages, including improved robustness to higher sparsity compared to existing methods. We conduct exhaustive experiments on five splits of the PASCAL-VOC and COCO datasets, achieving state-of-the-art performance. On average, we improve by 2.6, 3.9, and 9.6 mAP over previous state-of-the-art methods on three COCO splits of increasing sparsity.
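
To make the region-level formulation concrete, the following is a minimal sketch and not the method described above. It assumes a fixed IoU threshold as a stand-in for the learned separation of proposals, and it assumes per-proposal supervised and unsupervised losses are already available as plain numbers. The names `split_proposals`, `region_loss`, `iou_thresh`, and `unsup_weight` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def split_proposals(
    proposals: Sequence[Box],
    gt_boxes: Sequence[Box],
    iou_thresh: float = 0.5,  # assumption: fixed threshold, not a learned split
) -> Tuple[List[int], List[int]]:
    """Return indices of proposals treated as labeled and as unlabeled."""
    labeled, unlabeled = [], []
    for i, p in enumerate(proposals):
        best = max((iou(p, g) for g in gt_boxes), default=0.0)
        (labeled if best >= iou_thresh else unlabeled).append(i)
    return labeled, unlabeled


def region_loss(
    sup_losses: Sequence[float],
    unsup_losses: Sequence[float],
    labeled: Sequence[int],
    unlabeled: Sequence[int],
    unsup_weight: float = 1.0,  # assumption: simple weighted sum of the two terms
) -> float:
    """Average supervised loss on labeled regions plus weighted unsupervised loss."""
    sup = sum(sup_losses[i] for i in labeled) / max(len(labeled), 1)
    unsup = sum(unsup_losses[i] for i in unlabeled) / max(len(unlabeled), 1)
    return sup + unsup_weight * unsup


proposals = [(0, 0, 10, 10), (50, 50, 60, 60), (1, 1, 10, 10)]
gt_boxes = [(0, 0, 10, 10)]
labeled, unlabeled = split_proposals(proposals, gt_boxes)
total = region_loss([1.0, 5.0, 3.0], [0.2, 0.4, 0.6], labeled, unlabeled, 0.5)
```

In this sketch, a proposal that overlaps no annotated box is never penalized as background by the supervised term; it contributes only through the unsupervised term, which is the intuition behind treating unlabeled regions separately.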
