A Shape Reconstructability Measure of Object Part Importance with Applications to Object Detection and Localization

We propose a computational model that computes the importance of 2-D object shape parts, and we apply it to detect and localize objects with and without occlusions. The importance of a shape part (a localized contour fragment) is considered from the perspective of its contribution to the perception and recognition of the global shape of the object. Accordingly, the part importance measure is defined by the ability to estimate or recall the global shape of an object from the local part, namely the part’s “shape reconstructability”. More precisely, the shape reconstructability of a part is determined by two factors: part variation and part uniqueness. (i) Part variation measures the precision of the global shape reconstruction, i.e. the consistency of the reconstructed global shape with the true object shape; and (ii) part uniqueness quantifies the ambiguity of matching the part to the object, i.e. it takes into account that the part could be matched to the object at several different locations. Taking both factors into consideration, we propose an information-theoretic formulation that measures part importance by the conditional entropy of the reconstruction of the object shape from the part. Experimental results demonstrate the benefit of the proposed part importance measure for object detection, including improvements in detection rate, localization accuracy, and detection efficiency. In a comparison with other state-of-the-art object detectors in a challenging but common scenario, object detection with occlusions, the proposed importance measure yields a considerable improvement, with the detection rate increased by over 10%. On a subset of the challenging PASCAL dataset, the Interpolated Average Precision (as used in the PASCAL VOC challenge) is improved by 4–8%. Moreover, we perform a psychological experiment which provides evidence suggesting that humans use a similar measure for part importance when perceiving and recognizing shapes.
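To make the entropy-based idea concrete, the following is a minimal sketch, not the formulation used in the paper. It assumes that, for a given part, we already have a discrete set of candidate global-shape reconstructions with non-negative scores (a hypothetical input; how the candidates and scores are obtained is not specified here). The function name `reconstruction_entropy` and the toy score vectors are illustrative assumptions. Under this simplification, a part whose reconstruction is unambiguous has low entropy, while a part that matches the object equally well at several locations has high entropy.

```python
import math


def reconstruction_entropy(scores):
    """Shannon entropy (in bits) of a discrete distribution over candidate
    global-shape reconstructions obtained from a single part.

    `scores` are non-negative, unnormalized weights, one per candidate
    reconstruction (a hypothetical input for this sketch).
    """
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must contain at least one positive value")
    probs = [s / total for s in scores]
    # Terms with zero probability contribute nothing to the entropy.
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Toy example (assumed values): one part matches at a single location,
# another matches equally well at four locations.
unique_part = [1.0]
ambiguous_part = [0.25, 0.25, 0.25, 0.25]

h_unique = reconstruction_entropy(unique_part)
h_ambiguous = reconstruction_entropy(ambiguous_part)
```

In this toy setting, lower entropy corresponds to a part from which the global shape can be recalled with less ambiguity. A fuller treatment would also need to model part variation, which this sketch folds into the candidate scores.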
