Shape-from-silhouette of articulated objects and its use for human body kinematics estimation and motion capture

Shape-from-silhouette (SFS), also known as visual hull (VH) construction, is a popular 3D reconstruction method, which estimates the shape of an object from multiple silhouette images. The original SFS formulation assumes that the entire silhouette images are captured either at the same time or while the object is static. This assumption is violated when the object moves or changes shape. Hence the use of SFS with moving objects has been restricted to treating each time instant sequentially and independently. Recently we have successfully extended the traditional SFS formulation to refine the shape of a rigidly moving object over time. We further extend SFS to apply to dynamic articulated objects. Given silhouettes of a moving articulated object, the process of recovering the shape and motion requires two steps: (1) correctly segmenting (points on the boundary of) the silhouettes to each articulated part of the object, (2) estimating the motion of each individual part using the segmented silhouette. In this paper, we propose an iterative algorithm to solve this simultaneous assignment and alignment problem. Once we have estimated the shape and motion of each part of the object, the articulation points between each pair of rigid parts are obtained by solving a simple motion constraint between the connected parts. To validate our algorithm, we first apply it to segment the different body parts and estimate the joint positions of a person. The acquired kinematic (shape and joint) information is then used to track the motion of the person in new video sequences.

[1]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[2]  Michael Potmesil Generating octree models of 3D objects from their silhouettes in a sequence of images , 1987, Comput. Vis. Graph. Image Process..

[3]  Aldo Laurentini,et al.  The visual hull: a new tool for contour-based image understanding , 1991 .

[4]  Richard Szeliski,et al.  Rapid octree construction from image sequences , 1993 .

[5]  A. Laurentini,et al.  The Visual Hull Concept for Silhouette-Based Image Understanding , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[6]  Richard Szeliski,et al.  Image mosaicing for tele-reality applications , 1994, Proceedings of 1994 IEEE Workshop on Applications of Computer Vision.

[7]  Ioannis A. Kakadiaris,et al.  3D human body model acquisition from multiple views , 1995, Proceedings of IEEE International Conference on Computer Vision.

[8]  David J. Kriegman,et al.  Structure and motion of curved 3D objects from monocular silhouettes , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[9]  Harpreet S. Sawhney,et al.  Compact Representations of Videos Through Dominant and Multiple Motion Estimation , 1996, IEEE Trans. Pattern Anal. Mach. Intell..

[10]  Larry S. Davis,et al.  3-D model-based tracking of humans in action: a multi-view approach , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[11]  L. Davis,et al.  el-based tracking of humans in action: , 1996 .

[12]  C. Bregler,et al.  Video Motion Capture , 1997 .

[13]  Larry S. Davis,et al.  W/sup 4/: Who? When? Where? What? A real time system for detecting and tracking people , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[14]  Olivier D. Faugeras,et al.  3D articulated models and multi-view tracking with silhouettes , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[15]  Wojciech Matusik,et al.  Creating and Rendering Image-Based Visual Hulls , 1999 .

[16]  Takeo Kanade,et al.  A real time system for robust 3D voxel reconstruction of human motions , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[17]  David J. Fleet,et al.  Stochastic Tracking of 3D Human Figures Using 2D Image Motion , 2000, ECCV.

[18]  Andrew Blake,et al.  Articulated body motion capture by annealed particle filtering , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[19]  David J. Fleet,et al.  Stochastic Tracking of 3D Human Figures Using 2D Image Motion , 2000, ECCV.

[20]  James M. Rehg,et al.  Reconstruction of 3-D Figure Motion from 2-D Correspondences , 2001, CVPR 2001.

[21]  James M. Rehg,et al.  Reconstruction of 3D figure motion from 2D correspondences , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[22]  G. Cheung Visual Hull Construction, Alignment and Refinement Across Time , 2001 .

[23]  Roberto Cipolla,et al.  Structure and motion from silhouettes , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[24]  William H. Press,et al.  Numerical recipes in C , 2002 .

[25]  Takeo Kanade,et al.  Visual hull alignment and refinement across time: a 3D reconstruction algorithm combining shape-from-silhouette with stereo , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..