Semantic part segmentation using compositional model combining shape and appearance

In this paper, we study the problem of semantic part segmentation for animals. This is more challenging than standard object detection, object segmentation and pose estimation tasks because semantic parts of animals often have similar appearance and highly varying shapes. To tackle these challenges, we build a mixture of compositional models to represent the object boundary and the boundaries of semantic parts. And we incorporate edge, appearance, and semantic part cues into the compositional model. Given part-level segmentation annotation, we develop a novel algorithm to learn a mixture of compositional models under various poses and viewpoints for certain animal classes. Furthermore, a linear complexity algorithm is offered for efficient inference of the compositional model using dynamic programming. We evaluate our method for horse and cow using a newly annotated dataset on Pascal VOC 2010 which has pixelwise part labels. Experimental results demonstrate the effectiveness of our method.

[1]  Jitendra Malik,et al.  Simultaneous Detection and Segmentation , 2014, ECCV.

[2]  Christopher K. I. Williams,et al.  A Generative Model for Parts-based Object Segmentation , 2012, NIPS.

[3]  Sanja Fidler,et al.  Bottom-Up Segmentation for Top-Down Detection , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  George Papandreou,et al.  Learning a Dictionary of Shape Epitomes with Applications to Image Labeling , 2013, 2013 IEEE International Conference on Computer Vision.

[5]  Jian Dong,et al.  A Deformable Mixture Parsing Model with Parselets , 2013, 2013 IEEE International Conference on Computer Vision.

[6]  Cordelia Schmid,et al.  Bandit Algorithms for Tree Search , 2007, UAI.

[7]  Charless C. Fowlkes,et al.  Shape-based pedestrian parsing , 2011, CVPR 2011.

[8]  Alan L. Yuille,et al.  Parsing Semantic Parts of Cars Using Graphical Models and Segment Appearance Consistency , 2014, BMVC.

[9]  David J. Kriegman,et al.  Localizing Parts of Faces Using a Consensus of Exemplars , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Deva Ramanan,et al.  Face detection, pose estimation, and landmark localization in the wild , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Sven J. Dickinson,et al.  Detecting Curved Symmetric Parts Using a Deformable Disc Model , 2013, 2013 IEEE International Conference on Computer Vision.

[12]  Sanja Fidler,et al.  The Role of Context for Object Detection and Semantic Segmentation in the Wild , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Christian Szegedy,et al.  DeepPose: Human Pose Estimation via Deep Neural Networks , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Luc Van Gool,et al.  Using Recognition to Guide a Robot's Attention , 2008, Robotics: Science and Systems.

[15]  Long Zhu,et al.  Learning a Hierarchical Deformable Template for Rapid Deformable Object Parsing , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Andrew Blake,et al.  Contour-based learning for object detection , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[17]  Jian Dong,et al.  Towards Unified Human Parsing and Pose Estimation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Luis E. Ortiz,et al.  Parsing clothing in fashion photographs , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Stuart Geman,et al.  Context and Hierarchy in a Probabilistic Image Model , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[20]  ZhuLong,et al.  Learning a Hierarchical Deformable Template for Rapid Deformable Object Parsing , 2010 .

[21]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Peter N. Belhumeur,et al.  Bird Part Localization Using Exemplar-Based Models with Enforced Pose and Subcategory Consistency , 2013, 2013 IEEE International Conference on Computer Vision.

[23]  Iasonas Kokkinos,et al.  Unsupervised Learning of Object Deformation Models , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[24]  Jitendra Malik,et al.  Semantic segmentation using regions and parts , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Long Zhu,et al.  Unsupervised Structure Learning: Hierarchical Recursive Composition, Suspicious Coincidence and Competitive Exclusion , 2008, ECCV.

[26]  L. Bottou,et al.  A Generative Model for Parts-based Object Segmentation , 2012 .

[27]  Song-Chun Zhu,et al.  Learning Active Basis Model for Object Detection and Recognition , 2010, International Journal of Computer Vision.

[28]  Daniel P. Huttenlocher,et al.  Distance Transforms of Sampled Functions , 2012, Theory Comput..

[29]  Yi Yang,et al.  Articulated pose estimation with flexible mixtures-of-parts , 2011, CVPR 2011.

[30]  Calvin C. Zhao Critical Review : Contour Detection and Hierarchical Image Segmentation , 2015 .

[31]  Long Zhu,et al.  Max Margin Learning of Hierarchical Configural Deformable Templates (HCDTs) for Efficient Object Parsing and Pose Estimation , 2011, International Journal of Computer Vision.

[32]  Iasonas Kokkinos,et al.  Inference and Learning with Hierarchical Shape Models , 2011, International Journal of Computer Vision.

[33]  Cristian Sminchisescu,et al.  Semantic Segmentation with Second-Order Pooling , 2012, ECCV.

[34]  Sanja Fidler,et al.  Detect What You Can: Detecting and Representing Objects Using Holistic Models and Body Parts , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Cristian Sminchisescu,et al.  CPMC: Automatic Object Segmentation Using Constrained Parametric Min-Cuts , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Sanja Fidler,et al.  Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[37]  Daniel P. Huttenlocher,et al.  Pictorial Structures for Object Recognition , 2004, International Journal of Computer Vision.