How Important Are "Deformable Parts" in the Deformable Parts Model?

The Deformable Parts Model (DPM) has recently emerged as a very useful and popular tool for tackling the intra-category diversity problem in object detection. In this paper, we summarize the key insights from our empirical analysis of the important elements constituting this detector. More specifically, we study the relationship between the deformable parts and the mixture model components within this detector, and assess their relative importance. First, we find that increasing the number of components, and switching the initialization step from the original aspect-ratio and left-right flipping heuristics to appearance-based clustering, yields a considerable improvement in performance. More intriguingly, we observe that with these new components the part deformations can be turned off while still obtaining results that are almost on par with the original DPM detector.
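
To make the idea of appearance-based initialization concrete, the following is a minimal sketch, not the authors' implementation. It assumes each training example has already been reduced to a fixed-length appearance descriptor (for instance a HOG-like vector) and groups the examples with plain k-means, so that each cluster can seed one mixture component. The abstract does not specify the clustering algorithm or the features; the function names, the farthest-first seeding, and the toy descriptors are all illustrative assumptions.

```python
# Hypothetical sketch: seed mixture components by clustering appearance
# descriptors, instead of bucketing examples by bounding-box aspect ratio.
# k-means and the toy 2-D descriptors are assumptions for illustration only.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def farthest_first_centers(features, k):
    """Deterministic seeding: start from the first example, then repeatedly
    add the example farthest from all centers chosen so far."""
    centers = [list(features[0])]
    while len(centers) < k:
        farthest = max(
            features,
            key=lambda f: min(squared_distance(f, c) for c in centers),
        )
        centers.append(list(farthest))
    return centers


def cluster_components(features, k, iterations=20):
    """Run k-means and return (labels, centers).

    labels[i] is the index of the mixture component that example i would
    be used to initialize.
    """
    centers = farthest_first_centers(features, k)
    labels = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: attach each example to its nearest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(f, centers[j]))
            for f in features
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [f for f, lab in zip(features, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = mean_vector(members)
    return labels, centers


# Toy descriptors: two visually distinct groups of three examples each.
toy_features = [
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
    [5.0, 5.1], [5.1, 5.0], [4.9, 5.0],
]
toy_labels, toy_centers = cluster_components(toy_features, k=2)
print(toy_labels)
```

In a detector of this kind, each resulting cluster would then be used to train one root template. With part deformations turned off, the model reduces to a mixture of such rigid templates, which is the configuration the abstract reports as nearly matching the original DPM.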