论文信息 - Part and appearance sharing: Recursive Compositional Models for multi-view

Part and appearance sharing: Recursive Compositional Models for multi-view

We propose Recursive Compositional Models (RCMs) for simultaneous multi-view multi-object detection and parsing (e.g. view estimation and determining the positions of the object subparts). We represent the set of objects by a family of RCMs where each RCM is a probability distribution defined over a hierarchical graph which corresponds to a specific object and viewpoint. An RCM is constructed from a hierarchy of subparts/subgraphs which are learnt from training data. Part-sharing is used so that different RCMs are encouraged to share subparts/subgraphs which yields a compact representation for the set of objects and which enables efficient inference and learning from a limited number of training samples. In addition, we use appearance-sharing so that RCMs for the same object, but different viewpoints, share similar appearance cues which also helps efficient learning. RCMs lead to a multi-view multi-object detection system. We illustrate RCMs on four public datasets and achieve state-of-the-art performance.

[1] Yoram Singer,et al. Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers , 2000, J. Mach. Learn. Res..

[2] Shimon Ullman,et al. Class-Specific, Top-Down Segmentation , 2002, ECCV.

[3] Yali Amit,et al. A coarse-to-fine strategy for multiclass shape detection , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4] Bill Triggs,et al. Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[5] Cordelia Schmid,et al. The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[6] Luc Van Gool,et al. Towards Multi-View Object Class Detection , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[7] Bernt Schiele,et al. Multiple Object Class Detection with a Generative Model , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[8] Antonio Criminisi,et al. TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-class Object Recognition and Segmentation , 2006, ECCV.

[9] Antonio Torralba,et al. LabelMe: A Database and Web-Based Tool for Image Annotation , 2008, International Journal of Computer Vision.

[10] Sanja Fidler,et al. Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[11] Silvio Savarese,et al. 3D generic object categorization, localization and pose estimation , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[12] Song-Chun Zhu,et al. Deformable Template As Active Basis , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[13] Antonio Torralba,et al. Sharing Visual Features for Multiclass and Multiview Object Detection , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14] Derek Hoiem,et al. 3D LayoutCRF for Multi-View Object Class Recognition and Segmentation , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[15] Zhuowen Tu,et al. Detecting Object Boundaries Using Low-, Mid-, and High-level Information , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[16] Andrew Blake,et al. Efficiently Combining Contour and Texture Cues for Object Recognition , 2008, BMVC.

[17] David A. McAllester,et al. A discriminatively trained, multiscale, deformable part model , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[18] Long Zhu,et al. Unsupervised Structure Learning: Hierarchical Recursive Composition, Suspicious Coincidence and Competitive Exclusion , 2008, ECCV.

[19] Zhuowen Tu,et al. Active skeleton for non-rigid object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.