Real-time Accurate Object Detection using Multiple Resolutions

We propose a multi-resolution framework for general object detection, inspired by human visual search. Different resolutions are represented by a coarse-to-fine feature hierarchy. During detection, the lower-resolution features are used first to reject the majority of negative windows at relatively low cost, leaving a relatively small number of windows to be processed at higher resolutions. This makes it affordable to use computationally more expensive higher-resolution features to achieve high detection accuracy. We applied this framework to Histograms of Oriented Gradients (HOG) features for object detection. Our multi-resolution detector outperformed state-of-the-art methods (Dalal and Triggs, 2005) on pedestrian detection and was faster during both training and testing. Testing our method on motorbikes and cars from the VOC database revealed similar improvements in both speed and accuracy, suggesting that our approach is suitable for real-time general object detection applications.
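The coarse-to-fine rejection idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the window representation, the three toy scoring functions, and the thresholds are all hypothetical stand-ins for classifiers trained on HOG features at different resolutions. The sketch only shows how early, cheap stages shrink the set of windows that later, more expensive stages must score.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical window representation: the (x, y) corner of a candidate window.
Window = Tuple[int, int]
# A stage is a (scoring function, rejection threshold) pair.
Stage = Tuple[Callable[[Window], float], float]


def cascade_detect(
    windows: Sequence[Window], stages: Sequence[Stage]
) -> Tuple[List[Window], List[int]]:
    """Pass windows through stages ordered from coarsest to finest.

    Returns the windows that survive every stage, plus the number of
    windows each stage had to score.
    """
    survivors = list(windows)
    evaluated = []
    for score, threshold in stages:
        evaluated.append(len(survivors))
        # Windows scoring below the threshold are rejected and never
        # reach the later (more expensive) stages.
        survivors = [w for w in survivors if score(w) >= threshold]
    return survivors, evaluated


# Toy scoring functions (assumptions for illustration, not HOG classifiers).
# Each is stricter than the previous one about closeness to a known target.
TARGET = (40, 40)


def coarse(w: Window) -> float:
    return -max(abs(w[0] - TARGET[0]), abs(w[1] - TARGET[1]))


def medium(w: Window) -> float:
    return -(abs(w[0] - TARGET[0]) + abs(w[1] - TARGET[1]))


def fine(w: Window) -> float:
    return -((w[0] - TARGET[0]) ** 2 + (w[1] - TARGET[1]) ** 2)


windows = [(x, y) for x in range(0, 100, 10) for y in range(0, 100, 10)]
stages = [(coarse, -20), (medium, -20), (fine, -100)]
detections, evaluated = cascade_detect(windows, stages)
print("windows scored per stage:", evaluated)
print("detections:", detections)
```

In this toy setup the finest stage scores only 13 of the 100 candidate windows, which is the source of the speed-up: the per-window cost of the most expensive features is paid only for windows that the cheaper stages could not reject.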

[1]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009 .

[2]  Andrew Blake,et al.  Contour-based learning for object detection , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[3]  Rajesh P. N. Rao,et al.  Eye movements in iconic visual search , 2002, Vision Research.

[4]  Huitao Luo,et al.  Optimization design of cascaded classifiers , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[5]  Dorin Comaniciu,et al.  Mean shift analysis and applications , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[6]  Luc Van Gool,et al.  The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[7]  Shimon Ullman,et al.  Feature hierarchies for object classification , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[8]  Tomaso A. Poggio,et al.  A Trainable System for Object Detection , 2000, International Journal of Computer Vision.

[9]  Mei-Chen Yeh,et al.  Fast Human Detection Using a Cascade of Histograms of Oriented Gradients , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[10]  Simone Frintrop,et al.  Robust Object Detection at Regions of Interest with an Application in Ball Recognition , 2005, Proceedings of the 2005 IEEE International Conference on Robotics and Automation.

[11]  Bernt Schiele,et al.  Integrating representative and discriminant models for object category detection , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[12]  Trevor Darrell,et al.  The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[13]  Takeo Kanade,et al.  Object Detection Using the Statistics of Parts , 2004, International Journal of Computer Vision.

[14]  Andrew Zisserman,et al.  A Boundary-Fragment-Model for Object Detection , 2006, ECCV.

[15]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[16]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[17]  Wilson S. Geisler,et al.  Gaze-contingent real-time simulation of arbitrary visual fields , 2002, IS&T/SPIE Electronic Imaging.

[18]  Donald Geman,et al.  Coarse-to-Fine Face Detection , 2004, International Journal of Computer Vision.

[19]  Navneet Dalal,et al.  Finding People in Images and Videos , 2006 .

[20]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[21]  Ankur Agarwal,et al.  Hyperfeatures - Multilevel Local Coding for Visual Recognition , 2006, ECCV.

[22]  Manuela M. Veloso,et al.  Real-time object detection using segmented and grayscale images , 2006, Proceedings 2006 IEEE International Conference on Robotics and Automation (ICRA 2006).

[23]  Bernt Schiele,et al.  Pedestrian detection in crowded scenes , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[24]  Bernt Schiele,et al.  Multiple Object Class Detection with a Generative Model , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[25]  Andrew Zisserman,et al.  Incremental learning of object detectors using a visual shape alphabet , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[26]  Yali Amit,et al.  A coarse-to-fine strategy for multiclass shape detection , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27]  Paul A. Viola,et al.  Detecting Pedestrians Using Patterns of Motion and Appearance , 2005, International Journal of Computer Vision.

[28]  Luc Van Gool,et al.  Object Detection by Contour Segment Networks , 2006, ECCV.

[29]  Paul A. Viola,et al.  Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[30]  Tomaso A. Poggio,et al.  Example-Based Object Detection in Images by Components , 2001, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  James M. Rehg,et al.  Towards Optimal Training of Cascaded Detectors , 2006, ECCV.