Accelerating Deformable Part Models with Branch-and-Bound

Deformable Part Models (DPMs) play a prominent role in current object recognition research, as they rigorously model the shape variability of an object category by breaking an object into parts and modelling the relative locations of the parts. Inference with such models, however, requires solving a combinatorial optimization task. In this chapter, we will see how Branch-and-Bound can be used to perform this inference efficiently. Instead of evaluating the classifier score exhaustively for all part locations and scales, Branch-and-Bound allows us to quickly focus on promising image locations. The core problem that we address is how to compute bounds that accommodate part deformations; these bounds are what make Branch-and-Bound applicable to our problem. Compared with a baseline DPM implementation, we obtain exactly the same results but perform the part combination substantially faster, with up to tenfold speedups for single-object detection and even higher speedups for multiple objects.
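To make the idea concrete, the following is a minimal one-dimensional sketch of best-first Branch-and-Bound over root locations. It is an illustration under simplifying assumptions, not the chapter's method: the quadratic deformation cost, the function names (`part_bound`, `interval_bound`, `branch_and_bound`, `exhaustive`) and the `(unary, offset, weight)` part representation are all hypothetical, and the bound is computed by a plain scan rather than by any faster bounding scheme. What it does show is why the result matches exhaustive search: the bound never underestimates the best score in an interval and is exact for a single location.

```python
import heapq
import numpy as np

def part_bound(unary, offset, weight, lo, hi):
    """Upper bound on one part's contribution for any root location in [lo, hi].

    Assumed toy model: a part placed at position p for a root at x scores
    unary[p] - weight * (p - (x + offset))**2.
    """
    pos = np.arange(len(unary))
    # Ideal part positions for roots in [lo, hi] span [lo + offset, hi + offset].
    # The distance from p to that span is the smallest deformation any root allows.
    gap = np.maximum(0, np.maximum(lo + offset - pos, pos - (hi + offset)))
    return float(np.max(unary - weight * gap ** 2))

def interval_bound(parts, lo, hi):
    """Sum of per-part bounds; equals the true score when lo == hi."""
    return sum(part_bound(u, o, w, lo, hi) for u, o, w in parts)

def branch_and_bound(parts, n_roots):
    """Best-first search: always split the interval with the highest bound."""
    heap = [(-interval_bound(parts, 0, n_roots - 1), 0, n_roots - 1)]
    while heap:
        neg_bound, lo, hi = heapq.heappop(heap)
        if lo == hi:
            # A single location whose exact score beats every remaining bound.
            return lo, -neg_bound
        mid = (lo + hi) // 2
        for a, b in ((lo, mid), (mid + 1, hi)):
            heapq.heappush(heap, (-interval_bound(parts, a, b), a, b))

def exhaustive(parts, n_roots):
    """Baseline: score every root location and take the best."""
    scores = [interval_bound(parts, x, x) for x in range(n_roots)]
    best = int(np.argmax(scores))
    return best, scores[best]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 64
    # Three hypothetical parts: random unary scores, an anchor offset, a deformation weight.
    parts = [(rng.normal(size=n), offset, 0.05) for offset in (-5, 0, 7)]
    print(branch_and_bound(parts, n))
    print(exhaustive(parts, n))
```

Because the search always expands the interval with the highest bound, the first single-location interval it pops cannot be beaten by anything left in the queue, so it is the global maximum. Any speedup depends on how tight and how cheap the bounds are, which is the problem the chapter addresses.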
