Parametric Image Segmentation of Humans with Structural Shape Priors

The figure-ground segmentation of humans in images captured in natural environments is an outstanding open problem due to the presence of complex backgrounds, articulation, varying body proportions, partial views and viewpoint changes. In this work we propose class-specific segmentation models that leverage parametric max-flow image segmentation and a large dataset of human shapes. Our contributions are as follows: (1) formulation of a sub-modular energy model that combines class-specific structural constraints and data-driven shape priors, within a parametric max-flow optimization methodology that systematically computes all breakpoints of the model in polynomial time; (2) design of a data-driven class-specific fusion methodology, based on matching against a large training set of exemplar human shapes (100,000 in our experiments), that allows the shape prior to be constructed on-the-fly, for arbitrary viewpoints and partial views.

[1]  Andrew W. Fitzgibbon,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR 2011.

[2]  Bernt Schiele,et al.  Robust Object Detection with Interleaved Categorization and Segmentation , 2008, International Journal of Computer Vision.

[3]  Jitendra Malik,et al.  Poselets: Body part detectors trained using 3D human pose annotations , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[4]  Yi Yang,et al.  Articulated Human Detection with Flexible Mixtures of Parts , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Cristian Sminchisescu,et al.  Semantic Segmentation with Second-Order Pooling , 2012, ECCV.

[7]  Jian Dong,et al.  Towards Unified Object Detection and Semantic Segmentation , 2014, ECCV.

[8]  Jitendra Malik,et al.  Using contours to detect and localize junctions in natural images , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Koen E. A. van de Sande,et al.  Segmentation as selective search for object recognition , 2011, 2011 International Conference on Computer Vision.

[10]  Yi Yang,et al.  Parsing Occluded People , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  V. S. Stognienko,et al.  A new test for randomness and its application to some cryptographic problems , 2004 .

[12]  Subhransu Maji,et al.  Detecting People Using Mutually Consistent Poselet Activations , 2010, ECCV.

[13]  Jitendra Malik,et al.  Shape matching and object recognition using shape contexts , 2010, 2010 3rd International Conference on Computer Science and Information Technology.

[14]  Jitendra Malik,et al.  Multi-component Models for Object Detection , 2012, ECCV.

[15]  Shimon Ullman,et al.  Class-Specific, Top-Down Segmentation , 2002, ECCV.

[16]  Cristian Sminchisescu,et al.  Iterated Second-Order Label Sensitive Pooling for 3D Human Pose Estimation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Kristen Grauman,et al.  Shape Sharing for Object Segmentation , 2012, ECCV.

[18]  Andrew Zisserman,et al.  Pose search: Retrieving people using their pose , 2009, CVPR 2009.

[19]  Bernt Schiele,et al.  Pictorial structures revisited: People detection and articulated pose estimation , 2009, CVPR.

[20]  Daphne Koller,et al.  Multi-level inference by relaxed dual decomposition for human pose segmentation , 2011, CVPR 2011.

[21]  Derek Hoiem,et al.  Category Independent Object Proposals , 2010, ECCV.

[22]  Andrew Zisserman,et al.  OBJCUT: Efficient Segmentation Using Top-Down and Bottom-Up Cues , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Trevor Darrell,et al.  Sparse probabilistic regression for activity-independent human pose inference , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Cristian Sminchisescu,et al.  Efficient Closed-Form Solution to Generalized Boundary Detection , 2012, ECCV.

[25]  Anat Levin,et al.  Learning to Combine Bottom-Up and Top-Down Segmentation , 2006, ECCV.

[26]  Ronen Basri,et al.  Image Segmentation by Probabilistic Bottom-Up Aggregation and Cue Integration , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Andrew Zisserman,et al.  Pose search: Retrieving people using their pose , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Pietro Perona,et al.  Object detection and segmentation from joint embedding of parts and pixels , 2011, 2011 International Conference on Computer Vision.

[29]  Michael J. Black,et al.  Measure Locally, Reason Globally: Occlusion-sensitive Articulated Pose Estimation , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[30]  Vladimir Kolmogorov,et al.  Applications of parametric maxflow in computer vision , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[31]  Alexei A. Efros,et al.  Improving Spatial Support for Objects via Multiple Segmentations , 2007, BMVC.

[32]  Cordelia Schmid,et al.  Estimating Human Pose with Flowing Puppets , 2013, 2013 IEEE International Conference on Computer Vision.

[33]  Andrew Zisserman,et al.  Human Pose Estimation Using a Joint Pixel-wise and Part-wise Formulation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  Peter V. Gehler,et al.  Poselet Conditioned Pictorial Structures , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Guosheng Lin,et al.  Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Pushmeet Kohli,et al.  PoseCut: Simultaneous Segmentation and 3D Pose Estimation of Humans Using Dynamic Graph-Cuts , 2006, ECCV.

[37]  Vittorio Ferrari,et al.  Figure-ground segmentation by transferring window masks , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Sven J. Dickinson,et al.  Optimal Contour Closure by Superpixel Grouping , 2010, ECCV.

[39]  Alexei A. Efros,et al.  Segmenting Scenes by Matching Image Composites , 2009, NIPS.

[40]  Amir Rosenfeld,et al.  Extracting foreground masks towards object recognition , 2011, 2011 International Conference on Computer Vision.

[41]  Andrew Blake,et al.  Image Segmentation by Branch-and-Mincut , 2008, ECCV.

[42]  Cristian Sminchisescu,et al.  Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  Cristian Sminchisescu,et al.  Latent structured models for human pose estimation , 2011, 2011 International Conference on Computer Vision.

[44]  Cristian Sminchisescu,et al.  CPMC: Automatic Object Segmentation Using Constrained Parametric Min-Cuts , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45]  Subhransu Maji,et al.  Object segmentation by alignment of poselet activations to image contours , 2011, CVPR 2011.

[46]  Jitendra Malik,et al.  Learning a classification model for segmentation , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[47]  Bernt Schiele,et al.  2D Human Pose Estimation: New Benchmark and State of the Art Analysis , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[48]  Iasonas Kokkinos,et al.  Fast and Exact: ADMM-Based Discriminative Shape Segmentation with Loopy Part Models , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[49]  Dariu Gavrila,et al.  PedCut: an iterative framework for pedestrian segmentation combining shape models and multiple data cues , 2013, BMVC.

[50]  Loong Fah Cheong,et al.  Segmentation over Detection by Coupled Global and Local Sparse Representations , 2012, ECCV.

[51]  Michael J. Black,et al.  From Pictorial Structures to deformable structures , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[52]  Robert E. Tarjan,et al.  A Fast Parametric Maximum Flow Algorithm and Applications , 1989, SIAM J. Comput..

[53]  Charless C. Fowlkes,et al.  Contour Detection and Hierarchical Image Segmentation , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.