Markerless human motion transfer

We develop a computer vision-based system to transfer human motion from one subject to another. Our system uses a network of eight calibrated and synchronized cameras. We first build detailed kinematic models of the subjects using our algorithms for extracting shape from silhouette across time (G. Cheung et al., 2003). These models are then used to capture the motion (joint angles) of the subjects in new video sequences. Finally, we describe an image-based rendering algorithm that renders the captured motion applied to the articulated model of another person. Our rendering algorithm uses an ensemble of spatially and temporally distributed images to generate photo-realistic video of the transferred motion. We demonstrate the performance of the system by rendering throwing and kung fu motions on subjects who did not perform them.
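The core idea of the transfer step, applying joint angles captured from one subject to the articulated model of another, can be illustrated with a small forward-kinematics sketch. This is not the system's implementation: it assumes a planar two-bone chain instead of a full 3D body model, and the function name, bone lengths, and angle values are hypothetical, chosen only for illustration.

```python
import math

def forward_kinematics(bone_lengths, joint_angles):
    """Return 2D joint positions of a planar chain rooted at the origin.

    joint_angles[i] is the rotation (radians) of bone i relative to its parent.
    This planar chain is an illustrative simplification, not the paper's model.
    """
    x, y, heading = 0.0, 0.0, 0.0
    positions = [(x, y)]
    for length, angle in zip(bone_lengths, joint_angles):
        heading += angle                   # accumulate rotation down the chain
        x += length * math.cos(heading)    # advance along the rotated bone
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions

# Hypothetical arm models (upper arm, forearm), lengths in meters.
source_bones = [0.30, 0.25]
target_bones = [0.35, 0.28]

# Joint angles for one frame, standing in for angles captured from the
# source subject (made-up values).
captured_angles = [math.pi / 2, -math.pi / 2]

# The same angles drive both skeletons; only the bone lengths differ.
source_pose = forward_kinematics(source_bones, captured_angles)
target_pose = forward_kinematics(target_bones, captured_angles)
```

Because the motion is stored as joint angles and not as joint positions, the same frame can be replayed on a skeleton with different proportions; the resulting pose keeps the angles and adapts the positions to the target's bone lengths.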

[1] Takeo Kanade, et al. Shape-from-silhouette of articulated objects and its use for human body kinematics estimation and motion capture, 2003, IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2] Hans-Peter Seidel, et al. Free-viewpoint video of human actors, 2003, ACM Trans. Graph.

[3] Ioannis A. Kakadiaris, et al. 3D human body model acquisition from multiple views, 1995, Proceedings of IEEE International Conference on Computer Vision.

[4] Takeo Kanade, et al. Visual hull alignment and refinement across time: a 3D reconstruction algorithm combining shape-from-silhouette with stereo, 2003, IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[5] Mohan M. Trivedi, et al. Human Body Model Acquisition and Tracking Using Voxel Data, 2003, International Journal of Computer Vision.

[6] Narendra Ahuja, et al. Generating Octrees from Object Silhouettes in Orthographic Views, 1989, IEEE Trans. Pattern Anal. Mach. Intell.

[7] Simon Baker, et al. Visual hull construction, alignment and refinement for human kinematic modeling, motion tracking and rendering, 2003.

[8] Jitendra Malik, et al. Modeling and Rendering Architecture from Photographs: A hybrid geometry- and image-based approach, 1996, SIGGRAPH.

[9] Donald P. Greenberg, et al. Improved Computational Methods for Ray Tracing, 1984, TOGS.

[10] Thomas B. Moeslund, et al. A Survey of Computer Vision-Based Human Motion Capture, 2001, Comput. Vis. Image Underst.

[11] James F. Blinn, et al. Me and My (Fake) Shadow, 1988.

[12] V. Leitáo, et al. Computer Graphics: Principles and Practice, 1995.

[13] William E. Lorensen, et al. Marching cubes: A high resolution 3D surface construction algorithm, 1987, SIGGRAPH.

[14] Michael J. Black, et al. A framework for modeling the appearance of 3D articulated figures, 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition.

[15] Martial Hebert, et al. Control of Polygonal Mesh Resolution for 3-D Computer Vision, 1998, Graph. Model. Image Process.

[16] Takeo Kanade, et al. Spatio-Temporal View Interpolation, 2002, Rendering Techniques.

[17] Olivier D. Faugeras, et al. 3D articulated models and multi-view tracking with silhouettes, 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[18] Takeo Kanade, et al. Three-dimensional scene flow, 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.