Tracking and Modeling People in Video Sequences

Tracking and modeling people from video sequences has become an increasingly important research topic, with applications including animation, surveillance, and sports medicine. In this paper, we propose a model-based 3-D approach to recovering both body shape and motion. It takes advantage of a sophisticated animation model to achieve both robustness and realism. Stereo sequences of people in motion serve as input to our system. From these, we extract a 2½-D description of the scene and, optionally, silhouette edges. We propose an integrated framework to fit the model and to track the person's motion. The environment does not have to be engineered. We recover not only the motion but also a full animation model closely resembling the subject. We present results of our system on real sequences and show the generic model adjusting to the person and following various kinds of motion.
