Paper Information - Active Learning Strategies for Weakly-supervised Object Detection

Active Learning Strategies for Weakly-supervised Object Detection

Object detectors trained with weak annotations are affordable alternatives to fully-supervised counterparts. However, there is still a significant performance gap between them. We propose to narrow this gap by fine-tuning a base pre-trained weakly-supervised detector with a few fully-annotated samples automatically selected from the training set using “box-in-box” (BiB), a novel active learning strategy designed specifically to address the well-documented failure modes of weakly-supervised detectors. Experiments on the VOC07 and COCO benchmarks show that BiB outperforms other active learning techniques and significantly improves the base weakly-supervised detector’s performance with only a few fully-annotated images per class. BiB reaches 97% of the performance of fully-supervised Fast RCNN with only 10% of fully-annotated images on VOC07. On COCO, using on average 10 fully-annotated images per class, or equivalently 1% of the training set, BiB also reduces the performance gap (in AP) between the weakly-supervised detector and the fully-supervised Fast RCNN by over 70%, showing a good trade-off between performance and data efficiency. Our code is publicly available at https://github.com/huyvvo/BiB.
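The abstract reports results in two relative forms: performance as a fraction of the fully-supervised detector, and the share of the weak-to-full AP gap that is closed. The sketch below shows one plausible way to compute both quantities. The function names and the numbers in the usage lines are hypothetical placeholders chosen for illustration; they are not results from the paper, and the paper may define these ratios differently.

```python
def relative_performance(ap_method: float, ap_full: float) -> float:
    """Fraction of the fully-supervised AP reached by a method."""
    if ap_full <= 0:
        raise ValueError("ap_full must be positive")
    return ap_method / ap_full


def gap_closed(ap_weak: float, ap_full: float, ap_method: float) -> float:
    """Share of the AP gap between the weak and full detectors that a method closes."""
    gap = ap_full - ap_weak
    if gap <= 0:
        raise ValueError("ap_full must exceed ap_weak")
    return (ap_method - ap_weak) / gap


# Hypothetical AP values, for illustration only (not taken from the paper).
ap_weak, ap_full, ap_method = 10.0, 20.0, 17.5
print(f"relative performance: {relative_performance(ap_method, ap_full):.1%}")
print(f"gap closed: {gap_closed(ap_weak, ap_full, ap_method):.1%}")
```

Under this reading, "reduces the performance gap by over 70%" corresponds to `gap_closed(...)` returning a value above 0.7.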
