Tracking people in three dimensions using a hierarchical model of dynamics

We propose a novel hierarchical model of human dynamics for view-independent tracking of a human figure in monocular video sequences. The model is trained using real data from a collection of people. The top of the hierarchy contains information about the whole body. The lower levels of the hierarchy contain more detailed information about the possible poses of individual subparts of the body. In this article we describe our model and present experiments showing that we can recover 3D human figures from 2D images in a view-independent manner, and that we can also track people on whom the system has not been trained. © 2002 Elsevier Science B.V. All rights reserved.
