VideoMocap: modeling physically realistic human motion from monocular video sequences

This paper presents a video-based motion modeling technique for capturing physically realistic human motion from monocular video sequences. We formulate the modeling process within an image-based keyframe animation framework. The system first computes the camera parameters, the human skeletal size, and a small number of 3D key poses from the video, and then uses 2D image measurements at intermediate frames to automatically calculate the "in-between" poses. During reconstruction, we combine Newtonian physics, contact constraints, and 2D image measurements to simultaneously reconstruct full-body poses, joint torques, and contact forces. We demonstrate the system's effectiveness by generating a wide variety of physically realistic human actions from uncalibrated monocular video sequences such as sports video footage.
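
The reconstruction described above can be read as a constrained optimization over the whole sequence. The following is only a sketch of one common way to pose such a problem, not the formulation used in the paper. All symbols are assumptions introduced for illustration: $q_t$ is the full-body pose at frame $t$, $u_t$ the joint torques, $f_t$ the contact forces, $x_t$ the measured 2D joint locations, $\Pi$ the camera projection, $g$ the forward kinematics of the skeleton, and $\bar{q}_k$ the 3D key pose at key frame $t_k$. The torque penalty weighted by $\lambda$ is likewise an assumption, borrowed from typical spacetime-constraints formulations.

```latex
% Illustrative sketch only; symbols and the torque penalty are assumptions.
\begin{aligned}
\min_{\{q_t,\,u_t,\,f_t\}} \quad
  & \sum_{t} \bigl\| \Pi\bigl(g(q_t)\bigr) - x_t \bigr\|^2
    \;+\; \lambda \sum_{t} \| u_t \|^2 \\
\text{subject to} \quad
  & M(q_t)\,\ddot{q}_t + C(q_t, \dot{q}_t) + h(q_t)
    = u_t + J(q_t)^{\top} f_t
    && \text{(Newtonian dynamics)} \\
  & q_{t_k} = \bar{q}_k
    && \text{(3D key poses)} \\
  & f_t \in \mathcal{K}(q_t)
    && \text{(contact constraints)}
\end{aligned}
```

Here $M$ is the joint-space inertia matrix, $C$ collects the Coriolis and centrifugal terms, $h$ is the gravity term, $J$ is the contact Jacobian, and $\mathcal{K}(q_t)$ stands for the set of admissible contact forces (for example a friction cone while a foot is on the ground, and zero force otherwise). Solving for $q_t$, $u_t$, and $f_t$ together is what makes the recovery of poses, torques, and contact forces simultaneous.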
