Fusing generic objectness and deformable part-based models for weakly supervised object detection

In the context of lack of object-level annotation, we propose a model that enhances the weakly supervised deformable part model (DPM) by emphasizing the importance of size and aspect ratio of the initial class-specific root filter. For each image, to extract a reliable bounding box as this root filter estimate, we explore the generic objectness measurement to obtain a reference window based on the most salient region, and select a small set of candidate windows by adaptive thresholding and greedy Non-Maximum Suppression (NMS). The initial root filter estimate is decided by optimizing the score of overlap between the reference box and candidate boxes, as well as their corresponding objectness score. Then the derived window is treated as a positive training window for DPM training. Finally, we design a flexible enlarging-and-shrinking post-processing procedure to modify the output of DPM, which can effectively fit to the aspect ratio of the object and further improve the final accuracy. Experimental results on the challenging PASCAL VOC 2007 database demonstrate that our proposed framework is effective and competitive with the state-of-the-arts.

[1]  Thomas Deselaers,et al.  Localizing Objects While Learning Their Appearance , 2010, ECCV.

[2]  Yan Ke,et al.  The Design of High-Level Features for Photo Quality Assessment , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[3]  Thomas Deselaers,et al.  Measuring the Objectness of Image Windows , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Svetlana Lazebnik,et al.  Scene recognition and weakly supervised object localization with deformable part-based models , 2011, 2011 International Conference on Computer Vision.

[5]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[6]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Tao Xiang,et al.  Weakly supervised object detector learning with model drift detection , 2011, 2011 International Conference on Computer Vision.

[8]  Thomas Deselaers,et al.  Weakly Supervised Localization and Learning with Generic Knowledge , 2012, International Journal of Computer Vision.

[9]  Tao Xiang,et al.  Transfer Learning by Ranking for Weakly Supervised Object Annotation , 2017, BMVC.

[10]  Tao Xiang,et al.  In Defence of Negative Mining for Annotating Weakly Labelled Data , 2012, ECCV.

[11]  David A. McAllester,et al.  Object Detection with Grammar Models , 2011, NIPS.

[12]  Daniel P. Huttenlocher,et al.  Weakly Supervised Learning of Part-Based Spatial Models for Visual Object Recognition , 2006, ECCV.

[13]  Carsten Rother,et al.  Weakly supervised discriminative localization and classification: a joint learning process , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[14]  Ivan Laptev,et al.  Object Detection Using Strongly-Supervised Deformable Part Models , 2012, ECCV.

[15]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.