Local Context Priors for Object Proposal Generation

State-of-the-art methods for object detection are mostly based on an expensive exhaustive search over the image at different scales. In order to reduce the computational time, one can perform a selective search to obtain a small subset of relevant object hypotheses that need to be evaluated by the detector. For that purpose, we employ a regression to predict possible object scales and locations by exploiting the local context of an image. Furthermore, we show how a priori information, if available, can be integrated to improve the prediction. The experimental results on three datasets including the Caltech pedestrian and PASCAL VOC dataset show that our method achieves the detection performance of an exhaustive search approach with much less computational load. Since we model the prior distribution over the proposals locally, it generalizes well and can be successfully applied across datasets.

[1]  William T. Freeman,et al.  Latent hierarchical structural learning for object detection , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2]  Andrew W. Fitzgibbon,et al.  Efficient regression of general-activity human poses from depth images , 2011, 2011 International Conference on Computer Vision.

[3]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[4]  Pietro Perona,et al.  Pedestrian Detection: An Evaluation of the State of the Art , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Andrew Blake,et al.  Computationally efficient face detection , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[6]  Luc Van Gool,et al.  Branch&Rank: Non-Linear Object Detection , 2011, BMVC.

[7]  Andrew Zisserman,et al.  Multiple kernels for object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[8]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Franklin C. Crow,et al.  Summed-area tables for texture mapping , 1984, SIGGRAPH.

[10]  Wei Zhang,et al.  Real-time Accurate Object Detection using Multiple Resolutions , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[11]  Tsuhan Chen,et al.  Extracting adaptive contextual cues from unlabeled regions , 2011, 2011 International Conference on Computer Vision.

[12]  David A. McAllester,et al.  Cascade object detection with deformable part models , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  Zhuowen Tu,et al.  Feature Mining for Image Classification , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Antonio Criminisi,et al.  Regression Forests for Efficient Anatomy Detection and Localization in CT Studies , 2010, MCV.

[15]  Axel Pinz,et al.  Computer Vision – ECCV 2006 , 2006, Lecture Notes in Computer Science.

[16]  Alexei A. Efros,et al.  Putting Objects in Perspective , 2006, CVPR.

[17]  Matthew B. Blaschko,et al.  Learning a category independent object detection cascade , 2011, 2011 International Conference on Computer Vision.

[18]  James M. Rehg,et al.  Towards Optimal Training of Cascaded Detectors , 2006, ECCV.

[19]  Luc Van Gool,et al.  Hough Forests for Object Detection, Tracking, and Action Recognition , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Alexei A. Efros,et al.  An empirical study of context in object detection , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[22]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[23]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[24]  Thomas Deselaers,et al.  What is an object? , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[25]  Frédo Durand,et al.  Learning to predict where humans look , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[26]  Andrew Zisserman,et al.  An Exemplar Model for Learning Object Classes , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Thomas Deselaers,et al.  ClassCut for Unsupervised Class Segmentation , 2010, ECCV.

[28]  Rita Cucchiara,et al.  Multi-stage Sampling with Boosting Cascades for Pedestrian Detection in Images and Videos , 2010, ECCV.

[29]  Jonathan Warrell,et al.  Proposal generation for object detection using cascaded ranking SVMs , 2011, CVPR 2011.

[30]  Pietro Perona,et al.  Integral Channel Features , 2009, BMVC.

[31]  Christoph H. Lampert An efficient divide-and-conquer cascade for nonlinear object detection , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[32]  Antonio Torralba,et al.  Using the forest to see the trees: exploiting context for visual object detection and localization , 2010, CACM.

[33]  Koen E. A. van de Sande,et al.  Segmentation as selective search for object recognition , 2011, 2011 International Conference on Computer Vision.

[34]  Christoph H. Lampert,et al.  Efficient Subwindow Search: A Branch and Bound Framework for Object Localization , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[35]  Andrew Y. Ng,et al.  A Steiner tree approach to efficient object detection , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[36]  Jordi Gonzàlez,et al.  A coarse-to-fine approach for fast deformable object detection , 2011, CVPR 2011.

[37]  Luc Van Gool,et al.  Real time head pose estimation with random regression forests , 2011, CVPR 2011.

[38]  Ali Farhadi,et al.  Recognition using visual phrases , 2011, CVPR 2011.

[39]  Luc Van Gool,et al.  Dynamic 3D Scene Analysis from a Moving Vehicle , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[40]  Georg Langs,et al.  Medical Computer Vision , 2011 .