View independent human body pose estimation from a single perspective image

Recovering the 3D coordinates of the joints of the human body from an image is a critical first step for many model-based human tracking and optical motion capture systems. Unlike previous approaches that use a restrictive camera model or assume a calibrated camera, our work handles the general case of an uncalibrated perspective camera and is thus well suited to archived video. The input to the system is an image of the human body and correspondences of several body landmarks; the output is the set of 3D coordinates of those landmarks in a body-centric coordinate system. Using ideas from 3D model-based invariants, we set up a polynomial system of equations in the unknown head pitch, yaw and roll angles. Under the often-valid assumption that the torso twist is small, the head orientation has a finite number of solutions, which can be computed readily. Once the head orientation is computed, the epipolar geometry of the camera is recovered, leading to solutions for the 3D joint positions. Results are presented on synthetic and real images.
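
To make the unknowns concrete, the following is a minimal sketch of the forward model only: it builds a rotation from head pitch, yaw and roll and projects body-centric 3D landmarks through an ideal perspective camera. It is not the polynomial formulation of the paper. The axis convention (pitch about x, yaw about y, roll about z, composed as Rz·Ry·Rx), the single focal length, and the names `head_rotation` and `project` are all assumptions made for illustration.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def head_rotation(pitch, yaw, roll):
    """Rotation matrix from three head angles in radians.

    Assumed convention (not taken from the paper): pitch about x,
    yaw about y, roll about z, composed as R = Rz(roll) Ry(yaw) Rx(pitch).
    """
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    cr, sr = math.cos(roll), math.sin(roll)
    rx = [[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]]
    ry = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    rz = [[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]]
    return matmul3(rz, matmul3(ry, rx))


def project(points, rotation, translation, focal):
    """Perspective projection of body-centric 3D points to 2D image points.

    Each point is rotated and translated into the camera frame, then
    divided by its depth. The principal point is assumed to be at (0, 0).
    """
    image_points = []
    for p in points:
        cam = [sum(rotation[i][k] * p[k] for k in range(3)) + translation[i]
               for i in range(3)]
        image_points.append((focal * cam[0] / cam[2],
                             focal * cam[1] / cam[2]))
    return image_points
```

For example, `project([(1.0, 2.0, 0.0)], head_rotation(0.0, 0.0, 0.0), (0.0, 0.0, 5.0), 2.0)` maps a single landmark into the image. The estimation problem described above runs in the opposite direction: the image points are given, and the angles and camera parameters are the unknowns.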
