Tracking Human Body Pose on a Learned Smooth Space

Particle filtering is a popular method used in systems for tracking human body pose in video. One key difficulty in using particle filtering is caused by the curse of dimensionality: generally a very large number of particles is require d to adequately approximate the underlying pose distributio n in a high-dimensional state space. Although the number of degrees of freedom in the human body is quite large, in reality, the subset of allowable configurations in state spaceis generally restricted by human biomechanics, and the trajec tories in this allowable subspace tend to be smooth. Therefore, a framework is proposed to learn a low-dimensional representation of the high-dimensional human poses state space. This mapping can be learned using a Gaussian Process Latent Variable Model (GPLVM) framework. One important advantage of the GPLVM framework is that both the mapping to, and mapping from the embedded space are smooth; this facilitates sampling in the low-dimensional space, and samples generated in the low-dimensional embedded space are easily mapped back into the original highdimensional space. Moreover, human body poses that are similar in the original space tend to be mapped close to each other in the embedded space; this property can be exploited when sampling in the embedded space. The proposed framework is tested in tracking 2D human body pose using a Scaled Prismatic Model. Experiments on real life video sequences demonstrate the strength of the approach. In comparison with the Multiple Hypothesis Tracking and the standard Condensation algorithm, the proposed algorithm is able to maintain tracking reliably throughout the long test sequences. It also handles singularity and self oc clusion robustly.

[1]  James M. Rehg,et al.  Singularity analysis for articulated object tracking , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[2]  Christopher K. I. Williams Prediction with Gaussian Processes: From Linear Regression to Linear Prediction and Beyond , 1999, Learning in Graphical Models.

[3]  D. Mackay,et al.  Introduction to Gaussian processes , 1998 .

[4]  Michael Isard,et al.  ICONDENSATION: Unifying Low-Level and High-Level Tracking in a Stochastic Framework , 1998, ECCV.

[5]  Michael Isard,et al.  A Smoothing Filter for CONDENSATION , 1998, ECCV.

[6]  James M. Rehg,et al.  A multiple hypothesis approach to figure tracking , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[7]  Andrew Blake,et al.  A Probabilistic Exclusion Principle for Tracking Multiple Objects , 2000, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[8]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[9]  Rómer Rosales,et al.  Specialized mappings and the estimation of human body pose from a single image , 2000, Proceedings Workshop on Human Motion.

[10]  David J. Fleet,et al.  Stochastic Tracking of 3D Human Figures Using 2D Image Motion , 2000, ECCV.

[11]  Andrew Blake,et al.  Articulated body motion capture by annealed particle filtering , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[12]  David A. Forsyth,et al.  How Does CONDENSATION Behave with a Finite Number of Samples? , 2000, ECCV.

[13]  David J. Fleet,et al.  Stochastic Tracking of 3D Human Figures Using 2D Image Motion , 2000, ECCV.

[14]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[15]  J. Sullivan,et al.  Guiding random particles by deterministic search , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[16]  Cristian Sminchisescu,et al.  Covariance scaled sampling for monocular 3D body tracking , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[17]  Michael J. Black,et al.  Learning image statistics for Bayesian tracking , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[18]  Andrew Blake,et al.  Probabilistic tracking in a metric space , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[19]  Mikhail Belkin,et al.  Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering , 2001, NIPS.

[20]  Matthew Brand,et al.  Charting a Manifold , 2002, NIPS.

[21]  Neil D. Lawrence,et al.  Fast Sparse Gaussian Process Methods: The Informative Vector Machine , 2002, NIPS.

[22]  Trevor Darrell,et al.  Bayesian Articulated Tracking Using Single Frame Pose Sampling , 2003 .

[23]  Tieniu Tan,et al.  Recent developments in human motion analysis , 2003, Pattern Recognit..

[24]  Neil D. Lawrence,et al.  Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data , 2003, NIPS.

[25]  Jessica K. Hodgins,et al.  Synthesizing physically realistic human motion in low-dimensional, behavior-specific spaces , 2004, ACM Trans. Graph..

[26]  Michael Isard,et al.  Bayesian Object Localisation in Images , 2001, International Journal of Computer Vision.

[27]  Aaron Hertzmann,et al.  Style-based inverse kinematics , 2004, ACM Trans. Graph..

[28]  Michael Isard,et al.  CONDENSATION—Conditional Density Propagation for Visual Tracking , 1998, International Journal of Computer Vision.

[29]  A. Elgammal,et al.  Inferring 3D body pose from silhouettes using activity manifold learning , 2004, CVPR 2004.

[30]  Cristian Sminchisescu,et al.  Generative modeling for continuous non-linearly embedded visual inference , 2004, ICML.

[31]  Neil D. Lawrence,et al.  Probabilistic Non-linear Principal Component Analysis with Gaussian Process Latent Variable Models , 2005, J. Mach. Learn. Res..