Motion capture using joint skeleton tracking and surface estimation

This paper proposes a method for capturing the performance of a human or an animal from a multi-view video sequence. Given an articulated template model and silhouettes from a multi-view image sequence, our approach recovers not only the movement of the skeleton, but also the possibly non-rigid temporal deformation of the 3D surface. While large scale deformations or fast movements are captured by the skeleton pose and approximate surface skinning, true small scale deformations or non-rigid garment motion are captured by fitting the surface to the silhouette. We further propose a novel optimization scheme for skeleton-based pose estimation that exploits the skeleton's tree structure to split the optimization problem into a local one and a lower dimensional global one. We show on various sequences that our approach can capture the 3D motion of animals and humans accurately even in the case of rapid movements and wide apparel like skirts.

[1]  Jovan Popovic,et al.  Automatic rigging and animation of 3D characters , 2007, ACM Trans. Graph..

[2]  Hans-Peter Seidel,et al.  Interacting and Annealing Particle Filters: Mathematics and a Recipe for Applications , 2007, Journal of Mathematical Imaging and Vision.

[3]  Luca Ballan,et al.  Marker-less motion capture of skinned models in a four camera set-up using optical flow and silhouettes , 2008 .

[4]  Pascal Fua,et al.  Articulated Soft Objects for Multiview Shape and Motion Capture , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[5]  Jirí Zára,et al.  Geometric skinning with approximate dual quaternion blending , 2008, TOGS.

[6]  Olga Sorkine-Hornung,et al.  On Linear Variational Surface Deformation Methods , 2008, IEEE Transactions on Visualization and Computer Graphics.

[7]  Hans-Peter Seidel,et al.  Marker-less Deformable Mesh Tracking for Human Shape and Motion Capture , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Luc Van Gool,et al.  Full body tracking from multiple views using stochastic sampling , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[9]  Hans-Peter Seidel,et al.  A system for articulated tracking incorporating a clothing model , 2007, Machine Vision and Applications.

[10]  Jitendra Malik,et al.  Twist Based Acquisition and Tracking of Animal and Human Kinematics , 2004, International Journal of Computer Vision.

[11]  Michael J. Black,et al.  HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion , 2006 .

[12]  Michael J. Black,et al.  The Naked Truth: Estimating Body Shape Under Clothing , 2008, ECCV.

[13]  Adrian Hilton,et al.  A survey of advances in vision-based human motion capture and analysis , 2006, Comput. Vis. Image Underst..

[14]  Hans-Peter Seidel,et al.  Performance capture from sparse multi-view video , 2008, ACM Trans. Graph..

[15]  Michael J. Black,et al.  Detailed Human Shape and Pose from Images , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Juergen Gall,et al.  International Journal of Computer Vision manuscript No. (will be inserted by the editor) Optimization and Filtering for Human Motion Capture A Multi-layer Framework , 2022 .

[17]  Larry S. Davis,et al.  3-D model-based tracking of humans in action: a multi-view approach , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[18]  C. Cobelli,et al.  A Markerless Motion Capture System to Study Musculoskeletal Biomechanics: Visual Hull and Simulated Annealing Approach , 2006, Annals of Biomedical Engineering.

[19]  Takeo Kanade,et al.  Shape-From-Silhouette Across Time Part II: Applications to Human Modeling and Markerless Motion Tracking , 2005, International Journal of Computer Vision.

[20]  Cristian Sminchisescu,et al.  Estimating Articulated Human Motion with Covariance Scaled Sampling , 2003, Int. J. Robotics Res..

[21]  Roberto Cipolla,et al.  Real-time tracking of highly articulated structures in the presence of noisy measurements , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[22]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[23]  Ian D. Reid,et al.  Articulated Body Motion Capture by Stochastic Search , 2005, International Journal of Computer Vision.

[24]  Wojciech Matusik,et al.  Articulated mesh animation from multi-view silhouettes , 2008, ACM Trans. Graph..

[25]  Sebastian Thrun,et al.  SCAPE: shape completion and animation of people , 2005, SIGGRAPH '05.

[26]  Radu Horaud,et al.  Temporal Surface Tracking Using Mesh Evolution , 2008, ECCV.

[27]  John F. Hughes Oriented Projective Geometry: A Framework for Geometric Computations (Jorge Stolfi) , 1993, SIAM Rev..

[28]  Adrian Hilton,et al.  Model-based multiple view reconstruction of people , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.