Hard negative classes for multiple object detection

We propose an efficient method for training multiple object detectors simultaneously on a large-scale image dataset. The one-vs-all approach, which optimizes the boundary between positive samples from a target class and negative samples from all other classes, has been the standard approach to object detection. However, because it trains each object detector independently, the resulting scores are not balanced across object classes. The proposed method combines ideas from both detection and classification to balance the scores across all object classes. It optimizes the boundary between each target class and its “hard negative” samples, as in detection, while simultaneously balancing the detector scores across object classes, as in multi-class classification. We evaluated performance on multi-class object detection using a subset of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2011 dataset and showed that our method outperformed a de facto standard method.
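To make the idea of comparing scores across classes concrete, the following is a minimal sketch and not the algorithm of this paper. It assumes linear scorers and swaps in a plain multi-class margin-perceptron update: for each training sample, the "hard negative class" is taken to be the wrong class with the highest current score, and the true class is pushed above it by a margin. All names (`train`, `predict`, `toy`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def scores(W, x):
    """Linear score of feature vector x under each class's weight vector."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in W]


def train(samples, n_classes, dim, epochs=100, margin=1.0, lr=1.0, seed=0):
    """Joint training of all class scorers (illustrative assumption only).

    samples: list of (feature_vector, class_index) pairs.
    """
    rng = random.Random(seed)
    W = [[0.0] * dim for _ in range(n_classes)]
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            s = scores(W, x)
            # Hard negative class: the wrong class that currently scores highest.
            hard = max((c for c in range(n_classes) if c != y),
                       key=lambda c: s[c])
            # Update only when the true class does not win by the margin,
            # so scores of different classes are compared on one scale.
            if s[y] - s[hard] < margin:
                for i in range(dim):
                    W[y][i] += lr * x[i]
                    W[hard][i] -= lr * x[i]
    return W


def predict(W, x):
    """Return the index of the class with the highest score."""
    s = scores(W, x)
    return max(range(len(W)), key=lambda c: s[c])


# Toy, linearly separable data: three classes in two dimensions.
toy = [
    ([2.0, 0.0], 0), ([1.5, 0.2], 0),
    ([0.0, 2.0], 1), ([0.2, 1.5], 1),
    ([-2.0, -2.0], 2), ([-1.5, -1.8], 2),
]
W = train(toy, n_classes=3, dim=2)
```

Because every update compares the true class directly against its strongest competitor, the scorers are trained on a shared scale rather than independently, which is the contrast with one-vs-all training that the abstract draws. Calling `predict(W, x)` on any feature vector `x` returns the highest-scoring class under the learned weights.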
