Scalable multi-class object detection

Scalability of object detectors with respect to the number of classes is a very important issue for applications where many object classes need to be detected. While combining single-class detectors yields a linear complexity for testing, multi-class detectors that localize all objects at once come often at the cost of a reduced detection accuracy. In this work, we present a scalable multi-class detection algorithm which scales sublinearly with the number of classes without compromising accuracy. To this end, a shared discriminative codebook of feature appearances is jointly trained for all classes and detection is also performed for all classes jointly. Based on the learned sharing distributions of features among classes, we build a taxonomy of object classes. The taxonomy is then exploited to further reduce the cost of multi-class object detection. Our method has linear training and sublinear detection complexity in the number of classes. We have evaluated our method on the challenging PASCAL VOC'06 and PASCAL VOC'07 datasets and show that scaling the system does not lead to a loss in accuracy.

[1]  R. Stephenson A and V , 1962, The British journal of ophthalmology.

[2]  S. Kosslyn,et al.  Why are What and Where Processed by Separate Cortical Visual Systems? A Computational Investigation , 1989, Journal of Cognitive Neuroscience.

[3]  Leslie G. Ungerleider,et al.  ‘What’ and ‘where’ in the human brain , 1994, Current Opinion in Neurobiology.

[4]  Erica D. Palmer,et al.  Visual Representation of Object Location: Insights From Localization Impairments , 1996 .

[5]  Christiane Fellbaum,et al.  Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.

[6]  Michel Vidal-Naquet,et al.  Visual features of intermediate complexity and their use in classification , 2002, Nature Neuroscience.

[7]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[8]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[9]  Sanja Fidler,et al.  Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Bernt Schiele,et al.  Robust Object Detection with Interleaved Categorization and Segmentation , 2008, International Journal of Computer Vision.

[11]  Antonio Torralba,et al.  Sharing Visual Features for Multiclass and Multiview Object Detection , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Pietro Perona,et al.  Learning and using taxonomies for fast visual categorization , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Eli Shechtman,et al.  In defense of Nearest-Neighbor based image classification , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Roberto Cipolla,et al.  Semantic texton forests for image categorization and segmentation , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  Cordelia Schmid,et al.  Constructing Category Hierarchies for Visual Recognition , 2008, ECCV.

[16]  Jitendra Malik,et al.  Object detection using a max-margin Hough transform , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[18]  Juergen Gall,et al.  Class-specific Hough forests for object detection , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[20]  Jitendra Malik,et al.  Multi-scale object detection by clustering lines , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[21]  Sebastian Nowozin,et al.  On feature combination for multiclass object classification , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[22]  Reducing JointBoost-based multiclass classification to proximity search , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Tsuhan Chen,et al.  Implicit Shape Kernel for Discriminative Learning of the Hough Transform Detector , 2010, BMVC.

[25]  Luc Van Gool,et al.  Backprojection Revisited: Scalable Multi-view Object Detection and Similarity Metrics for Detections , 2010, ECCV.

[26]  Thomas Deselaers,et al.  What is an object? , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[27]  Luc Van Gool,et al.  Fast PRISM: Branch and Bound Hough Transform for Object Class Detection , 2011, International Journal of Computer Vision.

[28]  Cristian Sminchisescu,et al.  Constrained parametric min-cuts for automatic object segmentation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[29]  Charless C. Fowlkes,et al.  Multiresolution Models for Object Detection , 2010, ECCV.

[30]  Sanja Fidler,et al.  A Coarse-to-Fine Taxonomy of Constellations for Fast Multi-class Object Detection , 2010, ECCV.

[31]  Derek Hoiem,et al.  Category Independent Object Proposals , 2010, ECCV.

[32]  Antonio Torralba,et al.  Semantic Label Sharing for Learning with Many Categories , 2010, ECCV.