From Meaningful Contours to Discriminative Object Shape

Shape is a natural, highly prominent characteristic of objects that human vision uses every day. Despite its expressiveness, shape poses significant challenges for category-level object detection in cluttered scenes: object form is an emergent property that cannot be perceived locally and becomes available only once the whole object has been detected and segregated from the background. We therefore address the detection of objects and the assembly of their shape simultaneously. A dictionary of meaningful contours is obtained by clustering based on contour co-activation across all training images. We seek a joint, consistent placement of all contours in an image, since placing them independently of one another is unreliable given the emergent nature of shape. The characteristic object shape is thus learned by discovering spatially consistent configurations of all dictionary contours using maximum margin multiple instance learning. During recognition, objects are detected and their shape is explained simultaneously by optimizing a single cost function. We demonstrate the benefit of our approach on standard shape benchmarks.
