论文信息 - Seeking the Strongest Rigid Detector

Seeking the Strongest Rigid Detector

The current state of the art solutions for object detection describe each class by a set of models trained on discovered sub-classes (so called "components"), with each model itself composed of collections of interrelated parts (deformable models). These detectors build upon the now classic Histogram of Oriented Gradients+linear SVM combo. In this paper we revisit some of the core assumptions in HOG+SVM and show that by properly designing the feature pooling, feature selection, preprocessing, and training methods, it is possible to reach top quality, at least for pedestrian detections, using a single rigid component. Abstract We provide experiments for a large design space, that give insights into the design of classifiers, as well as relevant information for practitioners. Our best detector is fully feed-forward, has a single unified architecture, uses only histograms of oriented gradients and colour information in monocular static images, and improves over 23 other methods on the INRIA, ETH and Caltech-USA datasets, reducing the average miss-rate over HOG+SVM by more than 30%.

[1] Carlo Gatta,et al. A new algorithm for unsupervised global and local color correction , 2003, Pattern Recognit. Lett..

[2] Paul A. Viola,et al. Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.

[3] Bill Triggs,et al. Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[4] Yann LeCun,et al. What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[5] B. Schiele,et al. Multi-cue onboard pedestrian detection , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[6] David A. McAllester,et al. Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7] Pietro Perona,et al. Integral Channel Features , 2009, BMVC.

[8] Vincent Lepetit,et al. Fast Keypoint Recognition Using Random Ferns , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9] Charless C. Fowlkes,et al. Multiresolution Models for Object Detection , 2010, ECCV.

[10] Bohyung Han,et al. Improving object localization using macrofeature layout selection , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[11] Tony Jebara,et al. Variance Penalizing AdaBoost , 2011, NIPS.

[12] Pietro Perona,et al. Pedestrian Detection: An Evaluation of the State of the Art , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13] Charless C. Fowlkes,et al. Do We Need More Training Data or Better Models for Object Detection? , 2012, BMVC.

[14] Pascal Fua,et al. A Real-Time Deformable Detector , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15] Trevor Darrell,et al. Beyond spatial pyramids: Receptive field learning for pooled image features , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[16] Luc Van Gool,et al. Pedestrian detection at 100 frames per second , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[17] Piotr Dollár,et al. Crosstalk Cascades for Frame-Rate Pedestrian Detection , 2012, ECCV.