Data-Free Prior Model for Upper Body Pose Estimation and Tracking

Video-based human body pose estimation seeks to recover the pose of a person from an image or a video sequence that captures the person performing some activity. To handle noise and occlusion, a pose prior model is often constructed and then combined with the pose estimated from the image data to make body pose tracking more robust. Various body prior models have been proposed. Most of them are data-driven, typically learned from 3D motion capture data. Such data are expensive and time-consuming to collect, and the resulting data-based prior models do not generalize well to activities and subjects absent from the motion capture data. To alleviate this problem, we propose to learn the prior model from anatomical, biomechanical, and physical constraints rather than from motion capture data. To this end, we propose methods that effectively capture different types of constraints and systematically encode them into the prior model. Experiments on benchmark data sets show that the proposed prior model achieves performance comparable to that of data-based prior models for body motions that are present in the training data, and that it significantly outperforms them in generalization to different body motions and different subjects.
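To make the idea of combining a constraint-based prior with an image-based estimate concrete, the following is a minimal sketch for a single joint angle. It is not the method of this paper: the elbow range, the Gaussian stand-in for the image likelihood, the grid search, and all function names are assumptions chosen only for illustration.

```python
import math

# Hypothetical elbow flexion range in degrees (illustrative, not from the paper).
ELBOW_LIMITS = (0.0, 150.0)


def log_prior(angle, limits=ELBOW_LIMITS):
    """Log of a uniform prior over the feasible range; -inf outside it."""
    lo, hi = limits
    if lo <= angle <= hi:
        return -math.log(hi - lo)
    return -math.inf


def log_likelihood(angle, observed, sigma):
    """Assumed Gaussian stand-in for the image likelihood (up to a constant)."""
    return -0.5 * ((angle - observed) / sigma) ** 2


def map_angle(observed, sigma=10.0, step=0.5, limits=ELBOW_LIMITS):
    """Grid search for the angle that maximises log prior + log likelihood."""
    lo, hi = limits
    n = int(round((hi - lo) / step))
    candidates = [lo + i * step for i in range(n + 1)]
    return max(
        candidates,
        key=lambda a: log_prior(a, limits) + log_likelihood(a, observed, sigma),
    )
```

In this sketch, an image-based estimate that violates the assumed limit, such as `map_angle(170.0)`, is pulled back to the nearest feasible angle, while a feasible estimate such as `map_angle(90.0)` is left unchanged. No motion capture data is involved at any point; the prior comes entirely from the stated range.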
