Assemble New Object Detector With Few Examples

Learning a satisfactory object detector generally requires enough training data to cover most variations of the object. In this paper, we show that the performance of an object detector degrades severely when training examples are limited. We propose an approach that handles this issue by exploiting a set of pretrained auxiliary detectors for other categories. By mining the global and local relationships between the target object category and the auxiliary objects, a robust detector can be learned from very few training examples. We adopt the deformable part model proposed by Felzenszwalb et al. and simultaneously explore the root and part filters of the auxiliary object detectors under the guidance of the few training examples from the target object category. An iterative solution is introduced for this process. Extensive experiments on the PASCAL VOC 2007 challenge data set show encouraging performance for the new detector assembled from the related auxiliary detectors.
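
The idea of reusing pretrained auxiliary filters can be illustrated with a deliberately simplified sketch. It is not the method of the paper, which works on the root and part filters of deformable part models and uses its own iterative solution. Here the auxiliary filters are assumed to be fixed linear weight vectors, and only one mixing coefficient per filter is fitted to the few target examples by subgradient descent on a regularized hinge loss. The function name `assemble_detector`, its parameters, and the toy data are all hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def assemble_detector(aux_filters, examples, labels, epochs=200, lr=0.1, reg=0.01):
    """Fit mixing coefficients over fixed auxiliary filters (illustrative only).

    aux_filters: list of weight vectors taken from pretrained detectors (assumed linear).
    examples:    feature vectors of the few target training examples.
    labels:      +1 for target objects, -1 for background.
    Returns the coefficients alpha and the assembled filter w = sum_j alpha[j] * aux_filters[j].
    """
    k = len(aux_filters)
    alpha = [0.0] * k
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            # Responses of every auxiliary filter on this example.
            responses = [dot(f, x) for f in aux_filters]
            margin = y * dot(alpha, responses)
            for j in range(k):
                # Subgradient of reg/2 * alpha_j^2 + max(0, 1 - margin).
                grad = reg * alpha[j]
                if margin < 1.0:
                    grad -= y * responses[j]
                alpha[j] -= lr * grad
    dim = len(aux_filters[0])
    w = [sum(alpha[j] * aux_filters[j][d] for j in range(k)) for d in range(dim)]
    return alpha, w


# Toy usage: two auxiliary filters, four target examples (made-up numbers).
aux_filters = [[1.0, 0.0], [0.0, 1.0]]
examples = [[2.0, 0.1], [1.5, -0.2], [-1.0, 0.3], [-2.0, 0.0]]
labels = [1, 1, -1, -1]
alpha, w = assemble_detector(aux_filters, examples, labels)
```

Because only as many parameters as there are auxiliary filters are learned, a handful of examples can be enough to fit them, which is the intuition behind borrowing from related detectors when target data are scarce.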
