Full body modeling from video sequences

Synthetic modeling of human bodies and the simulation of motion is a longstanding problem in modeling and animation, and much remains to be investigated before near-realistic performance can be achieved. At present, an experienced graphic designer needs a long workflow to build a complete and realistic model that closely resembles a specific person. Our goal is to automate this process. This work addresses the automation of synchronized multi-camera calibration and investigates the reconstruction, modeling and tracking of human body models from video sequences. Such processes have many applications, such as entertainment, sports medicine/athletic training, and biometry. One topic of this work is the implementation of fully automatic geometric calibration software. The method requires a wand moving through object space. Depending on the volume size and accuracy requirements, different wands can be used. Assuming a static multi-station network, every frame serves as a snapshot for building a well-defined point cloud. We have developed a scene description language to make data collection as flexible as possible. Moreover, the data can be distributed over an internet network, and different operating systems can be used. The identification of the points is established through standard point detection operators implemented as scripts that run locally on every workstation in the intranet. These identified points, together with the frame number, are used to establish relative orientations and to set up a network. These approximations are then used to set up a bundle adjustment network, with multiple command-line switches available to steer the process. The approximations are first displayed in an OpenGL 3D viewer and can be analysed for further runs, or the process can be stopped if an error occurred. This first milestone in the geometric calibration procedure allows the user to inspect the correctness of the collection process.
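The role of the frame number as a point identifier can be illustrated with a minimal sketch. It assumes a wand carrying a single marker and synchronized cameras, so that each camera contributes at most one image point per frame. The tuple layout, the function name `match_by_frame` and the camera names are hypothetical and are not taken from the software described here.

```python
from collections import defaultdict


def match_by_frame(detections, min_cameras=2):
    """Group per-camera wand detections by frame number.

    detections: iterable of (camera_id, frame_number, x, y) tuples
    (hypothetical layout, assuming one wand marker per frame).
    Returns {frame_number: {camera_id: (x, y)}} restricted to frames
    observed by at least min_cameras cameras.
    """
    by_frame = defaultdict(dict)
    for camera_id, frame, x, y in detections:
        # With synchronized cameras, the frame number alone tells us
        # which image points belong to the same object point.
        by_frame[frame][camera_id] = (x, y)
    return {
        frame: views
        for frame, views in sorted(by_frame.items())
        if len(views) >= min_cameras
    }


# Illustrative image coordinates in pixels (made-up values).
detections = [
    ("cam0", 1, 312.4, 208.9),
    ("cam1", 1, 298.1, 215.3),
    ("cam0", 2, 320.0, 210.2),  # seen by one camera only, so it is dropped
    ("cam1", 3, 305.7, 220.8),
    ("cam2", 3, 150.2, 199.5),
]
tie_points = match_by_frame(detections)
```

Frames observed by two or more cameras yield corresponding image points that could then feed a relative orientation between those cameras; frames seen by a single camera carry no such information and are discarded in this sketch.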
The unified bundle adjustment determines the network and outputs all relevant geometric calibration protocols as text and graphical reports. The sequential nature of the images can be usefully combined with additional spatial information about the “empty scene” and/or the human body approximation. The spatial and temporal character of the video sequence is fully used. The foreground/background subtraction algorithm depends on the baseline of the multi-camera setup. If the baseline between neighbouring sensors is small, we use disparity maps to separate foreground and background. In an all-around network with a small number of cameras where automatic identifica-
