Using action classification for human-pose estimation

This paper presents a 3D-point-cloud system that extracts a 3D-point-cloud feature (VISH) from depth-sensor observations to reduce feature/depth ambiguity, and that estimates human poses using the result of action classification together with a kinematic model. Based on the concept of distributed representation, the system uses a non-parametric action-mixture model that represents the high-dimensional human-pose space with low-dimensional manifolds, one per action, over which poses are searched. Within each manifold, the probability distribution is estimated from feature similarity. The distributions in the manifolds are then redistributed according to the stationary distribution of a Markov chain that models the frequency of actions. After the redistribution, the manifolds are combined according to the distribution determined by the action classification. In addition, the spatial relationship between human-body parts is modeled explicitly by a kinematic chain. Computer-simulation results showed that multiple low-dimensional manifolds can represent the human-pose space. The 3D-point-cloud system reduced the overall error and the standard deviation compared with other approaches that do not use action classification.
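The redistribution and combination steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the use of power iteration, and the specific weighting rule (stationary probability times action-classification probability, then renormalized) are assumptions made here for illustration only.

```python
import numpy as np

def stationary_distribution(P, iters=1000, tol=1e-12):
    """Approximate the stationary distribution of a row-stochastic
    transition matrix P by power iteration (assumed method)."""
    P = np.asarray(P, dtype=float)
    pi = np.full(P.shape[0], 1.0 / P.shape[0])  # start from uniform
    for _ in range(iters):
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi

def mix_manifolds(similarities, P, action_probs):
    """Combine per-action pose distributions into one mixture.

    similarities: one 1-D array of non-negative similarity scores per
                  action manifold (hypothetical input format).
    P:            action-to-action transition matrix (row-stochastic).
    action_probs: output of an action classifier, one value per action.
    """
    pi = stationary_distribution(P)
    # Assumed rule: weight each manifold by action frequency (pi)
    # and by the classifier's belief, then renormalize.
    weights = pi * np.asarray(action_probs, dtype=float)
    weights = weights / weights.sum()
    mixed = []
    for sim, w in zip(similarities, weights):
        sim = np.asarray(sim, dtype=float)
        mixed.append(w * sim / sim.sum())  # per-manifold distribution, scaled
    return mixed

# Toy example with two actions and two candidate poses per manifold.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
mixture = mix_manifolds([[1.0, 3.0], [2.0, 2.0]], P, [0.5, 0.5])
total_mass = sum(m.sum() for m in mixture)
```

In this toy setting the scaled per-manifold distributions together carry a total probability mass of one, so the mixture can be read as a single distribution over all candidate poses across the action manifolds.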
