Paper Information - Branch&Rank for Efficient Object Detection

Ranking hypothesis sets is a powerful concept for efficient object detection. In this work, we propose a branch&rank scheme that detects objects with often less than 100 ranking operations. This efficiency enables the use of strong and also costly classifiers like non-linear SVMs with RBF-$\chi^2$ kernels. We thereby relieve an inherent limitation of branch&bound methods as bounds are often not tight enough to be effective in practice. Our approach features three key components: a ranking function that operates on sets of hypotheses and a grouping of these into different tasks. Detection efficiency results from adaptively sub-dividing the object search space into decreasingly smaller sets. This is inherited from branch&bound, while the ranking function supersedes a tight bound which is often unavailable (except for rather limited function classes). The grouping makes the system effective: it separates image classification from object recognition, yet combines them in a single formulation, phrased as a structured SVM problem. A novel aspect of branch&rank is that a better ranking function is expected to decrease the number of classifier calls during detection. We use the VOC’07 dataset to demonstrate the algorithmic properties of branch&rank.
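The adaptive sub-division described in the abstract can be illustrated with a small best-first search, given here as a minimal sketch rather than the authors' implementation. Everything named in it is an assumption made for illustration: hypothesis sets are rectangles `(x_lo, x_hi, y_lo, y_hi)` of candidate object positions, `split` halves a set along its longer side, and `toy_rank` is a hand-written stand-in for the learned ranking function (which the abstract describes as trained via a structured SVM). The grouping into tasks is not modelled.

```python
import heapq
from itertools import count


def split(hset):
    """Halve a hypothesis set (x_lo, x_hi, y_lo, y_hi) along its longer side."""
    x_lo, x_hi, y_lo, y_hi = hset
    if x_hi - x_lo >= y_hi - y_lo:
        mid = (x_lo + x_hi) // 2
        return [(x_lo, mid, y_lo, y_hi), (mid + 1, x_hi, y_lo, y_hi)]
    mid = (y_lo + y_hi) // 2
    return [(x_lo, x_hi, y_lo, mid), (x_lo, x_hi, mid + 1, y_hi)]


def is_single(hset):
    """True when the set contains exactly one hypothesis."""
    x_lo, x_hi, y_lo, y_hi = hset
    return x_lo == x_hi and y_lo == y_hi


def branch_and_rank(root, rank, max_calls=10_000):
    """Best-first search: always refine the highest-ranked hypothesis set.

    Returns the detected (x, y) position and the number of ranking calls.
    """
    tie = count()  # tie-breaker so the heap never compares sets directly
    calls = 1
    queue = [(-rank(root), next(tie), root)]  # heapq is a min-heap, so negate
    while queue:
        _, _, hset = heapq.heappop(queue)
        if is_single(hset):
            return (hset[0], hset[2]), calls
        for child in split(hset):
            if calls >= max_calls:
                return None, calls
            heapq.heappush(queue, (-rank(child), next(tie), child))
            calls += 1
    return None, calls


def toy_rank(hset, target=(37, 12)):
    """Hypothetical ranker: sets closer to a known target score higher."""
    x_lo, x_hi, y_lo, y_hi = hset
    dx = max(x_lo - target[0], 0, target[0] - x_hi)
    dy = max(y_lo - target[1], 0, target[1] - y_hi)
    return -(dx + dy)


# Search a 64 x 64 grid of candidate positions (4096 hypotheses).
detection, calls = branch_and_rank((0, 63, 0, 63), toy_rank)
print(detection, calls)
```

With a ranker that always prefers the set containing the object, the search descends directly to it, so the number of ranking calls grows with the logarithm of the search-space size instead of with the number of hypotheses. A weaker ranker would force more sets to be refined, which matches the abstract's remark that a better ranking function is expected to reduce the number of classifier calls.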

Luc Van Gool | Peter V. Gehler | Alain D. Lehmann
