Paper information - Weakly- and Semi-Supervised Object Detection with Expectation-Maximization Algorithm

Object detection trained with image-level labels instead of instance-level labels (i.e., bounding boxes) is an important problem in computer vision, since large-scale image datasets with instance-level labels are extremely costly to obtain. In this paper, we address this challenging problem by developing an Expectation-Maximization (EM) based object detection method using deep convolutional neural networks (CNNs). Our method is applicable to both the weakly-supervised and semi-supervised settings. Extensive experiments on the PASCAL VOC 2007 benchmark show that (1) in the weakly-supervised setting, our method provides a significant detection performance improvement over current state-of-the-art methods, and (2) with access to a small number of strongly (instance-level) annotated images, our method can almost match the performance of the fully supervised Fast R-CNN. We share our source code at this https URL.
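To make the EM idea in the abstract concrete, the following is a minimal toy sketch of hard EM for weakly-supervised localization. It is not the paper's method: the abstract does not describe the model, and the paper uses deep CNNs, whereas this sketch assumes a simple two-Gaussian model with unit variance over hypothetical proposal feature vectors. The function name `hard_em_localize`, the `bags`/`labels` inputs, and the "exactly one object proposal per positive image" assumption are all illustrative. The latent variable is which proposal in a positive image contains the object; the E-step guesses it, and the M-step refits the model on the resulting pseudo-labels.

```python
import numpy as np


def hard_em_localize(bags, labels, n_iters=5):
    """Toy hard-EM for one class with image-level labels only (illustrative).

    bags:   list of (n_proposals, d) float arrays, one per image
            (hypothetical proposal features)
    labels: list of 0/1 flags; 1 means the class appears somewhere in the image
    Returns (mu_obj, mu_bg, chosen); chosen maps image index -> proposal index.
    Assumes at least one positive and one negative image.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Initialisation: treat every proposal in a positive image as "object".
    mu_obj = np.concatenate([bags[i] for i in pos]).mean(axis=0)
    mu_bg = np.concatenate([bags[i] for i in neg]).mean(axis=0)
    chosen = {}
    for _ in range(n_iters):
        # E-step (hard): for unit-variance Gaussians, the object-vs-background
        # log-likelihood ratio is linear in x, up to a constant that does not
        # change the argmax within one image.
        w = mu_obj - mu_bg
        new_chosen = {i: int(np.argmax(bags[i] @ w)) for i in pos}
        if new_chosen == chosen:
            break  # assignments stopped changing
        chosen = new_chosen
        # M-step: refit both means from the completed (pseudo-labelled) data.
        obj = np.stack([bags[i][k] for i, k in chosen.items()])
        bg = [np.delete(bags[i], k, axis=0) for i, k in chosen.items()]
        bg += [bags[i] for i in neg]
        mu_obj, mu_bg = obj.mean(axis=0), np.concatenate(bg).mean(axis=0)
    return mu_obj, mu_bg, chosen
```

In a semi-supervised variant of this sketch, images with box annotations would simply have their entry in `chosen` fixed rather than inferred in the E-step; that extension is likewise an assumption, not a description of the paper's procedure.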
