Combining 2D Feature Tracking and Volume Reconstruction for Online Video-Based Human Motion Capture

The acquisition of human motion data is of major importance for creating interactive virtual environments, intelligent user interfaces, and realistic computer animations. Today's performance of off-the-shelf computer hardware enables marker-free non-intrusive optical tracking of the human body. In addition, recent research shows that it is possible to efficiently acquire and render volumetric scene representations in real-time. This paper describes a system to capture human motion without the use of markers or scene-intruding devices. Instead, a 2D feature tracking algorithm and a silhouette-based 3D volumetric scene reconstruction method are applied directly to the image data. A person is recorded by multiple synchronized cameras, and a multi-layer hierarchical kinematic skeleton is fitted to each frame in a two-stage process. The pose of a first model layer at every time step is determined from the tracked 3D locations of hands, head and feet. A more sophisticated second skeleton layer is fitted to the motion data by applying a volume registration technique. We present results with a prototype system showing that the approach is capable of running at interactive frame rates.
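The silhouette-based volume reconstruction mentioned above can be illustrated with a generic voxel-carving sketch. This is not the paper's implementation; the abstract does not give one. It only shows the general idea that a voxel is kept when its projection falls inside the foreground silhouette of every camera. The function name `carve_visual_hull`, the helper `demo`, the 8x8 masks, and the two toy orthographic cameras are all assumptions made for illustration.

```python
import numpy as np


def carve_visual_hull(points, projections, silhouettes):
    """Return a boolean mask over `points` (N x 3 voxel centers).

    A voxel is marked True only if its projection lies inside the
    foreground silhouette of every camera. `projections` holds 3x4
    camera matrices, `silhouettes` holds boolean images (rows = v,
    columns = u). All names here are hypothetical.
    """
    n = len(points)
    homog = np.hstack([points, np.ones((n, 1))])  # N x 4 homogeneous coords
    inside = np.ones(n, dtype=bool)
    for P, sil in zip(projections, silhouettes):
        uvw = homog @ P.T                          # N x 3 projected coords
        w = uvw[:, 2]
        valid = w > 1e-9                           # skip degenerate projections
        u = np.full(n, -1, dtype=int)
        v = np.full(n, -1, dtype=int)
        u[valid] = np.round(uvw[valid, 0] / w[valid]).astype(int)
        v[valid] = np.round(uvw[valid, 1] / w[valid]).astype(int)
        height, width = sil.shape
        in_img = valid & (u >= 0) & (u < width) & (v >= 0) & (v < height)
        hit = np.zeros(n, dtype=bool)
        hit[in_img] = sil[v[in_img], u[in_img]]    # foreground lookup
        inside &= hit                              # carve away misses
    return inside


def demo():
    """Carve an 8x8x8 grid with two toy orthographic views (assumed setup)."""
    axis = np.arange(8)
    xs, ys, zs = np.meshgrid(axis, axis, axis, indexing="ij")
    points = np.stack([xs.ravel(), ys.ravel(), zs.ravel()], axis=1).astype(float)

    sil = np.zeros((8, 8), dtype=bool)
    sil[2:6, 2:6] = True                           # square foreground region

    # Front view maps (x, y, z) -> (u, v) = (x, y); side view maps to (z, y).
    front = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)
    side = np.array([[0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)

    occupied = carve_visual_hull(points, [front, side], [sil, sil])
    return points[occupied]
```

Calling `demo()` returns the voxel centers that survive carving from both views. A real system would use calibrated perspective cameras and silhouettes obtained by background subtraction; with only a few views the carved volume is a conservative superset of the true body shape.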

[42]  Hans-Peter Seidel, et al. Combining 2D feature tracking and volume reconstruction for online video-based human motion capture, 2002, 10th Pacific Conference on Computer Graphics and Applications, 2002. Proceedings.
