论文信息 - Immediate, Scalable Object Category Detection

Immediate, Scalable Object Category Detection

The objective of this work is object category detection in large scale image datasets in the manner of Video Google - an object category is specified by a HOG classifier template, and retrieval is immediate at run time. We make the following three contributions: (i) a new image representation based on mid-level discriminative patches, that is designed to be suited to immediate object category detection and inverted file indexing, (ii) a sparse representation of a HOG classifier using a set of mid-level discriminative classifier patches, and (iii) a fast method for spatial reranking images on their detections. We evaluate the detection method on the standard PASCAL VOC 2007 dataset, together with a 100K image subset of ImageNet, and demonstrate near state of the art detection performance at low ranks whilst maintaining immediate retrieval speeds. Applications are also demonstrated using an exemplar-SVM for pose matched retrieval.

Andrew Zisserman | Yusuf Aytar | Andrew Zisserman | Y. Aytar

[1] Alexei A. Efros,et al. Ensemble of exemplar-SVMs for object detection and beyond , 2011, 2011 International Conference on Computer Vision.

[2] Andrew Zisserman,et al. An Exemplar Model for Learning Object Classes , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[3] Alexei A. Efros,et al. Unsupervised Discovery of Mid-Level Discriminative Patches , 2012, ECCV.

[4] C. V. Jawahar,et al. Blocks That Shout: Distinctive Parts for Scene Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[5] Andrew Zisserman,et al. Sparse kernel approximations for efficient classification and detection , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[6] Bastian Leibe,et al. Interleaved Object Categorization and Segmentation , 2003, BMVC.

[7] Antonio Torralba,et al. Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .

[8] Cordelia Schmid,et al. Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[9] Christopher G. Harris,et al. A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[10] Andrew Zisserman,et al. Enhancing Exemplar SVMs using Part Level Transfer Regularization , 2012, BMVC.

[11] David A. McAllester,et al. Cascade object detection with deformable part models , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[12] Michael Isard,et al. Descriptor Learning for Efficient Retrieval , 2010, ECCV.

[13] Trevor Darrell,et al. Discriminatively Activated Sparselets , 2013, ICML.

[14] Cordelia Schmid,et al. Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[15] Iasonas Kokkinos,et al. Rapid Deformable Object Detection using Dual-Tree Branch-and-Bound , 2011, NIPS.

[16] Jonathon Shlens,et al. Fast, Accurate Detection of 100,000 Object Classes on a Single Machine , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[17] Antonio Torralba,et al. Spectral Hashing , 2008, NIPS.

[18] Guillermo Sapiro,et al. Non-local sparse models for image restoration , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[19] Svetlana Lazebnik,et al. Scene recognition and weakly supervised object localization with deformable part-based models , 2011, 2011 International Conference on Computer Vision.

[20] Andrew Zisserman,et al. The devil is in the details: an evaluation of recent feature encoding methods , 2011, BMVC.

[21] Trevor Darrell,et al. Sparselet Models for Efficient Multiclass Object Detection , 2012, ECCV.

[22] Cordelia Schmid,et al. A Comparison of Affine Region Detectors , 2005, International Journal of Computer Vision.

[23] Michael Isard,et al. Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[24] Andrew Zisserman,et al. Multiple queries for large scale specific object retrieval , 2012, BMVC.

[25] Florent Perronnin,et al. Large-scale image retrieval with compressed Fisher vectors , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[26] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[27] Alexei A. Efros,et al. Data-driven visual similarity for cross-domain image matching , 2011, ACM Trans. Graph..

[28] Jitendra Malik,et al. Discriminative Decorrelation for Clustering and Classification , 2012, ECCV.

[29] Jordi Gonzàlez,et al. A coarse-to-fine approach for fast deformable object detection , 2011, CVPR 2011.

[30] David A. Forsyth,et al. Fast Template Evaluation with Vector Quantization , 2013, NIPS.

[31] Pedro F. Felzenszwalb,et al. Reconfigurable models for scene recognition , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[32] Cordelia Schmid,et al. Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33] Svetlana Lazebnik,et al. Locality-sensitive binary codes from shift-invariant kernels , 2009, NIPS.

[34] François Fleuret,et al. Exact Acceleration of Linear Object Detectors , 2012, ECCV.

[35] Cordelia Schmid,et al. Spatial Weighting for Bag-of-Features , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[36] Andrew Zisserman,et al. Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[37] David Nistér,et al. Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).