A coarse-to-fine approach for fast deformable object detection

We present a method that can dramatically accelerate object detection with part-based models. The method is based on the observation that the cost of detection is likely dominated by the cost of matching each part to the image, and not by the cost of computing the optimal configuration of the parts, as is commonly assumed. To minimize the number of part-to-image comparisons, we propose a multiresolution hierarchical part-based model and a corresponding coarse-to-fine inference procedure that recursively eliminates unpromising part placements from the search space. The method yields a ten-fold speedup over the standard dynamic programming approach and, combined with the cascade-of-parts approach, a hundred-fold speedup in some cases. We evaluate our method extensively on the PASCAL VOC and INRIA datasets, demonstrating a large increase in detection speed with little degradation in accuracy.

Highlights:
- New method for fast deformable object detection.
- The cost of detection is dominated by the cost of matching the model parts.
- Multiresolution part-based model with a fast coarse-to-fine inference.
- Lateral connections among parts help to maintain a coherent object representation.
- Our fast inference can be combined with cascades to multiply the speed-up.
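The coarse-to-fine idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function name `coarse_to_fine_search`, the `keep` parameter, the top-k pruning rule, and the assumption that each level exactly doubles the resolution of the previous one are all hypothetical choices made for illustration. The score grids stand in for precomputed part-to-image match scores; in a real detector each lookup would be an expensive filter response, which is why the number of evaluations is counted.

```python
from typing import List, Tuple

Grid = List[List[float]]


def coarse_to_fine_search(levels: List[Grid], keep: int = 1) -> Tuple[float, Tuple[int, int], int]:
    """Toy coarse-to-fine placement search (illustrative assumption, not the paper's code).

    levels: score grids ordered coarsest first; each grid is assumed to have
            twice the height and width of the previous one.
    keep:   number of coarse hypotheses that survive pruning.
    Returns (accumulated score, position at the finest level, scores evaluated).
    """
    evaluations = 0

    # Evaluate every placement at the coarsest level only.
    hypotheses = []
    for y, row in enumerate(levels[0]):
        for x, score in enumerate(row):
            hypotheses.append((score, y, x))
            evaluations += 1

    # Prune: keep only the most promising coarse placements.
    hypotheses.sort(reverse=True)
    hypotheses = hypotheses[:keep]

    # Refine each survivor by looking only at its four children per level.
    for grid in levels[1:]:
        refined = []
        for total, y, x in hypotheses:
            best = None
            for dy in (0, 1):
                for dx in (0, 1):
                    cy, cx = 2 * y + dy, 2 * x + dx
                    evaluations += 1
                    candidate = (total + grid[cy][cx], cy, cx)
                    if best is None or candidate > best:
                        best = candidate
            refined.append(best)
        hypotheses = refined

    total, y, x = max(hypotheses)
    return total, (y, x), evaluations


# Made-up scores: a 2x2 coarse grid and a 4x4 fine grid.
coarse = [[0.25, 0.75],
          [0.5, 0.0]]
fine = [[0.0, 0.0, 0.125, 0.25],
        [0.0, 0.0, 0.5, 1.0],
        [0.5, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]

score, position, evaluations = coarse_to_fine_search([coarse, fine], keep=1)
print(score, position, evaluations)
```

On this made-up input the search scores 8 placements (4 coarse, then 4 children of the single survivor) instead of all 20 cells across both levels. The saving comes at the price of possibly missing a fine-level maximum whose coarse parent was pruned, which is the accuracy trade-off the abstract refers to.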
