View independent human body pose estimation from a single perspective image

Recovering the 3D coordinates of various joints of the human body from an image is a critical first step for several model-based human tracking and optical motion capture systems. Unlike previous approaches that have used a restrictive camera model or assumed a calibrated camera, our work deals with the general case of a perspective uncalibrated camera and is thus well suited for archived video. The input to the system is an image of the human body and correspondences of several body landmarks, while the output is the set of 3D coordinates of the landmarks in a body-centric coordinate system. Using ideas from 3D model based invariants, we set up a polynomial system of equations in the unknown head pitch, yaw and roll angles. If we are able to make the often-valid assumption that the torso twist is small, there are finite numbers of solutions to the head-orientation that can be computed readily. Once the head orientation is computed, the epipolar geometry of the camera is recovered, leading to solutions to the 3D joint positions. Results are presented on synthetic and real images.

[1]  David J. Fleet,et al.  Stochastic Tracking of 3D Human Figures Using 2D Image Motion , 2000, ECCV.

[2]  Vladimir M. Zatsiorsky,et al.  Kinetics of Human Motion , 2002 .

[3]  Matthew Brand,et al.  Shadow puppetry , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[4]  Lancelot Law Whyte Chirality , 1990, The Lancet.

[5]  Hsi-Jian Lee,et al.  Determination of 3D human body postures from a single view , 1985, Comput. Vis. Graph. Image Process..

[6]  Cristian Sminchisescu,et al.  Kinematic jump processes for monocular 3D human tracking , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[7]  M. Bertero,et al.  Ill-posed problems in early vision , 1988, Proc. IEEE.

[8]  Brian Sallans,et al.  A Hierarchical Community of Experts , 1999, Learning in Graphical Models.

[9]  Rómer Rosales,et al.  Specialized mappings and the estimation of human body pose from a single image , 2000, Proceedings Workshop on Human Motion.

[10]  Miss A.O. Penney (b) , 1974, The New Yale Book of Quotations.

[11]  Larry S. Davis,et al.  Ghost: a human body part labeling system using silhouettes , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[12]  Ioannis A. Kakadiaris,et al.  Estimating Anthropometry and Pose from a Single Uncalibrated Image , 2001, Comput. Vis. Image Underst..

[13]  Isaac Weiss,et al.  Model-Based Recognition of 3D Objects from Single Images , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[14]  Thomas B. Moeslund,et al.  A Survey of Computer Vision-Based Human Motion Capture , 2001, Comput. Vis. Image Underst..

[15]  Rómer Rosales,et al.  Estimating 3D body pose using uncalibrated cameras , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[16]  Alex Pentland,et al.  Pfinder: Real-Time Tracking of the Human Body , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[17]  Tieniu Tan,et al.  Recent developments in human motion analysis , 2003, Pattern Recognit..

[18]  Andrew Blake,et al.  Articulated body motion capture by annealed particle filtering , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[19]  Rómer Rosales,et al.  Inferring body pose without tracking body parts , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[20]  Adrian Hilton Towards model-based capture of a persons shape, appearance and motion , 1999, Proceedings IEEE International Workshop on Modelling People. MPeople'99.

[21]  William T. Freeman,et al.  Bayesian Reconstruction of 3D Human Motion from Single-Camera Video , 1999, NIPS.

[22]  J. Rissanen Stochastic Complexity and Modeling , 1986 .

[23]  S. Fomin,et al.  Elements of the Theory of Functions and Functional Analysis , 1961 .

[24]  Ioannis A. Kakadiaris,et al.  Estimating anthropometry and pose from a single image , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[25]  Jitendra Malik,et al.  Tracking people with twists and exponential maps , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[26]  Geoffrey E. Hinton,et al.  A View of the Em Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.

[27]  Rómer Rosales,et al.  3D trajectory recovery for tracking multiple objects and trajectory guided recognition of actions , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[28]  James W. Davis,et al.  The representation and recognition of human movement using temporal templates , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[29]  R. Okafor Maximum likelihood estimation from incomplete data , 1987 .

[30]  Yang Song,et al.  Towards detection of human motion , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[31]  Larry S. Davis,et al.  Tracking of humans in action: a 3-D model-based approach , 1996 .

[32]  Tomaso A. Poggio,et al.  Regularization Theory and Neural Networks Architectures , 1995, Neural Computation.

[33]  L. Davis,et al.  el-based tracking of humans in action: , 1996 .

[34]  Takeo Kanade,et al.  Model-based tracking of self-occluding articulated objects , 1995, Proceedings of IEEE International Conference on Computer Vision.

[35]  Robert A. Jacobs,et al.  Hierarchical Mixtures of Experts and the EM Algorithm , 1993, Neural Computation.

[36]  Trevor Darrell,et al.  Inferring 3D structure with a statistical image-based shape model , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.