The stitched puppet: A graphical model of 3D human shape and pose

We propose a new 3D model of the human body that is both realistic and part-based. The body is represented by a graphical model in which nodes of the graph correspond to body parts that can independently translate and rotate in 3D and deform to represent different body shapes and to capture pose-dependent shape variations. Pairwise potentials define a “stitching cost” for pulling the limbs apart, giving rise to the stitched puppet (SP) model. Unlike existing realistic 3D body models, the distributed representation facilitates inference by allowing the model to more effectively explore the space of poses, much like existing 2D pictorial structures models. We infer pose and body shape using a form of particle-based max-product belief propagation. This gives SP the realism of recent 3D body models with the computational advantages of part-based models. We apply SP to two challenging problems involving estimating human shape and pose from 3D data. The first is the FAUST mesh alignment challenge, where ours is the first method to successfully align all 3D meshes with no pose prior. The second involves estimating pose and shape from crude visual hull representations of complex body movements.

[1]  Hans-Peter Seidel,et al.  A Statistical Model of Human Pose and Body Shape , 2009, Comput. Graph. Forum.

[2]  Michael Isard,et al.  Loose-limbed People: Estimating 3D Human Pose and Motion Using Non-parametric Belief Propagation , 2011, International Journal of Computer Vision.

[3]  Bodo Rosenhahn,et al.  Model-Based Pose Estimation , 2011, Visual Analysis of Humans.

[4]  Alex Pentland,et al.  Recovery of Nonrigid Motion and Structure , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[5]  Michael J. Black,et al.  FAUST: Dataset and Evaluation for 3D Mesh Registration , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Stefan Carlsson,et al.  3D Pictorial Structures for Multiple View Articulated Pose Estimation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Bodo Rosenhahn,et al.  Multisensor-fusion for 3D full-body human motion capture , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[8]  Michael J. Black,et al.  MoSh: motion and shape capture from sparse markers , 2014, ACM Trans. Graph..

[9]  T. Funkhouser,et al.  Möbius voting for surface correspondence , 2009, SIGGRAPH 2009.

[10]  Ghassan Hamarneh,et al.  A Survey on Shape Correspondence , 2011, Comput. Graph. Forum.

[11]  Dragomir Anguelov,et al.  Object Pose Detection in Range Scan Data , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[12]  Sebastian Thrun,et al.  SCAPE: shape completion and animation of people , 2005, SIGGRAPH 2005.

[13]  C. Cobelli,et al.  A Markerless Motion Capture System to Study Musculoskeletal Biomechanics: Visual Hull and Simulated Annealing Approach , 2006, Annals of Biomedical Engineering.

[14]  Daniel P. Huttenlocher,et al.  Pictorial Structures for Object Recognition , 2004, International Journal of Computer Vision.

[15]  PetlandAlex,et al.  Recovery of Nonrigid Motion and Structure , 1991 .

[16]  Cristian Sminchisescu,et al.  Human Pose Estimation from Silhouettes - A Consistent Approach Using Distance Level Sets , 2002, WSCG.

[17]  Michael Isard,et al.  Attractive People: Assembling Loose-Limbed Models using Non-parametric Belief Propagation , 2003, NIPS.

[18]  Vladimir G. Kim,et al.  Blended intrinsic maps , 2011, SIGGRAPH 2011.

[19]  Bernt Schiele,et al.  Pictorial structures revisited: People detection and articulated pose estimation , 2009, CVPR.

[20]  Hans-Peter Seidel,et al.  Fast articulated motion tracking using a sums of Gaussians body model , 2011, 2011 International Conference on Computer Vision.

[21]  Zoran Popovic,et al.  The space of human body shapes: reconstruction and parameterization from range scans , 2003, ACM Trans. Graph..

[22]  Chang Shu,et al.  Human shape correspondence with automatically predicted landmarks , 2011, Machine Vision and Applications.

[23]  Nassir Navab,et al.  3D Pictorial Structures for Multiple Human Pose Estimation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Björn Stenger,et al.  Human Body Shape Estimation Using a Multi-resolution Manifold Forest , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  David J. Fleet,et al.  Stochastic Tracking of 3D Human Figures Using 2D Image Motion , 2000, ECCV.

[26]  Zicheng Liu,et al.  Tensor-Based Human Body Modeling , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  R. Plankers,et al.  Articulated soft objects for video-based body modeling , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[28]  Ian D. Reid,et al.  Articulated Body Motion Capture by Stochastic Search , 2005, International Journal of Computer Vision.

[29]  D. Marr,et al.  Representation and recognition of the spatial organization of three-dimensional shapes , 1978, Proceedings of the Royal Society of London. Series B. Biological Sciences.

[30]  Michael J. Black,et al.  From Pictorial Structures to deformable structures , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Paul J. Besl,et al.  A Method for Registration of 3-D Shapes , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[32]  Michael J. Black,et al.  Preserving Modes and Messages via Diverse Particle Selection , 2014, ICML.

[33]  Michael J. Black,et al.  Detailed Human Shape and Pose from Images , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  Michael J. Black,et al.  Home 3D body scans from noisy image and range data , 2011, 2011 International Conference on Computer Vision.

[35]  Bernt Schiele,et al.  Multi-view Pictorial Structures for 3D Human Pose Estimation , 2013, BMVC.

[36]  David J. Fleet,et al.  Stochastic Tracking of 3 D Human Figures Using 2 D Image Motion , 2000 .

[37]  Zoran Popovic,et al.  Articulated body deformation from range scan data , 2002, SIGGRAPH.

[38]  Jitendra Malik,et al.  Tracking people with twists and exponential maps , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[39]  Ramakant Nevatia,et al.  Structured Descriptions of Complex Objects , 1973, IJCAI.

[40]  Hans-Peter Seidel,et al.  Motion capture using joint skeleton tracking and surface estimation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[41]  Michael J. Black,et al.  Estimating human shape and pose from a single image , 2009, 2009 IEEE 12th International Conference on Computer Vision.