Backing Off: Hierarchical Decomposition of Activity for 3D Novel Pose Recovery

For model-based 3D human pose estimation, even simple models of the human body lead to high-dimensional state spaces. Where the class of activity is known a priori, lowdimensional activity models learned from training data make possible a thorough and efficient search for the best pose. Conversely, searching for solutions in the full state space places no restriction on the class of motion to be recovered, but is both difficult and expensive. This paper explores a potential middle ground between these approaches, using the hierarchical Gaussian process latent variable model to learn activity at different hierarchical scales within the human skeleton. We show that by training on full-body activity data then descending through the hierarchy in stages and exploring subtrees independently of one another, novel poses may be recovered. Experimental results on motion capture data and monocular video sequences demonstrate the utility of the approach, and comparisons are drawn with existing low-dimensional activity models.

[1]  Adrian Hilton,et al.  A survey of advances in vision-based human motion capture and analysis , 2006, Comput. Vis. Image Underst..

[2]  Larry S. Davis,et al.  3-D model-based tracking of humans in action: a multi-view approach , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[3]  Michael Isard,et al.  Partitioned Sampling, Articulated Objects, and Interface-Quality Hand Tracking , 2000, ECCV.

[4]  David J. Fleet,et al.  3D People Tracking with Gaussian Process Dynamical Models , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[5]  Daniel P. Huttenlocher,et al.  Pictorial Structures for Object Recognition , 2004, International Journal of Computer Vision.

[6]  Neil J. Gordon,et al.  A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking , 2002, IEEE Trans. Signal Process..

[7]  Cristian Sminchisescu,et al.  Generative modeling for continuous non-linearly embedded visual inference , 2004, ICML.

[8]  Neil D. Lawrence,et al.  Hierarchical Gaussian process latent variable models , 2007, ICML '07.

[9]  Neil A. Thacker,et al.  Real-time Body Tracking Using a Gaussian Process Latent Variable Model , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[10]  Ehud Rivlin,et al.  Using Gaussian Process Annealing Particle Filter for 3D Human Tracking , 2008, EURASIP J. Adv. Signal Process..

[11]  Mun Wai Lee,et al.  Proposal maps driven MCMC for estimating human body pose in static images , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[12]  Michael J. Black,et al.  Implicit Probabilistic Models of Human Motion for Synthesis and Tracking , 2002, ECCV.

[13]  David J. Fleet,et al.  Robust Online Appearance Models for Visual Tracking , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[14]  Neil D. Lawrence,et al.  Probabilistic Non-linear Principal Component Analysis with Gaussian Process Latent Variable Models , 2005, J. Mach. Learn. Res..

[15]  Deva Ramanan,et al.  Learning to parse images of articulated bodies , 2006, NIPS.

[16]  David J. Fleet,et al.  Temporal motion models for monocular and multiview 3D human body tracking , 2006, Comput. Vis. Image Underst..

[17]  Michael J. Black,et al.  A Quantitative Evaluation of Video-based 3D Person Tracking , 2005, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.

[18]  Sidharth Bhatia,et al.  Tracking loose-limbed people , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[19]  Miguel Á. Carreira-Perpiñán,et al.  People Tracking with the Laplacian Eigenmaps Latent Variable Model , 2007, NIPS.

[20]  David J. Fleet,et al.  This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE Gaussian Process Dynamical Model , 2007 .

[21]  Cristian Sminchisescu,et al.  Estimating Articulated Human Motion with Covariance Scaled Sampling , 2003, Int. J. Robotics Res..

[22]  Hans-Peter Seidel,et al.  Interacting and Annealing Particle Filters: Mathematics and a Recipe for Applications , 2007, Journal of Mathematical Imaging and Vision.

[23]  Michael Beetz,et al.  Evaluation of Hierarchical Sampling Strategies in 3D Human Pose Estimation , 2008, BMVC.

[24]  Ian D. Reid,et al.  Articulated Body Motion Capture by Stochastic Search , 2005, International Journal of Computer Vision.

[25]  Rui Li,et al.  Monocular Tracking of 3D Human Motion with a Coordinated Mixture of Factor Analyzers , 2006, ECCV.

[26]  David J. Fleet,et al.  Priors for people tracking from small training sets , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[27]  Rui Li,et al.  Articulated Pose Estimation in a Learned Smooth Space of Feasible Solutions , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Workshops.