Bayesian Image Based 3D Pose Estimation

We introduce a 3D human pose estimation method from single image, based on a hierarchical Bayesian non-parametric model. The proposed model relies on a representation of the idiosyncratic motion of human body parts, which is captured by a subdivision of the human skeleton joints into groups. A dictionary of motion snapshots for each group is generated. The hierarchy ensures to integrate the visual features within the pose dictionary. Given a query image, the learned dictionary is used to estimate the likelihood of the group pose based on its visual features. The full-body pose is reconstructed taking into account the consistency of the connected group poses. The results show that the proposed approach is able to accurately reconstruct the 3D pose of previously unseen subjects.

[1]  Albert Y. Lo,et al.  On a Class of Bayesian Nonparametric Estimates: I. Density Estimates , 1984 .

[2]  Raveendran Paramesran,et al.  Single camera 3D human pose estimation: A Review of current techniques , 2009, 2009 International Conference for Technical Postgraduates (TECHPOS).

[3]  Ronald Poppe,et al.  Vision-based human motion analysis: An overview , 2007, Comput. Vis. Image Underst..

[4]  Wen Gao,et al.  Robust Estimation of 3D Human Poses from a Single Image , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Michael J. Black,et al.  Pose-conditioned joint angle limits for 3D human pose reconstruction , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Yi Yang,et al.  Articulated Human Detection with Flexible Mixtures of Parts , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  John W. Fisher,et al.  Parallel Sampling of DP Mixture Models using Sub-Cluster Splits , 2013, NIPS.

[8]  T. Ferguson A Bayesian Analysis of Some Nonparametric Problems , 1973 .

[9]  Xiaowei Zhou,et al.  Sparseness Meets Deepness: 3D Human Pose Estimation from Monocular Video , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Vijay Kumar,et al.  On the generation of smooth three-dimensional rigid body motions , 1998, IEEE Trans. Robotics Autom..

[11]  Fiora Pirri,et al.  Bayesian Non-parametric Inference for Manifold Based MoCap Representation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[12]  John W. Fisher,et al.  A Dirichlet Process Mixture Model for Spherical Data , 2015, AISTATS.

[13]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[14]  Francesc Moreno-Noguer,et al.  Single image 3D human pose estimation from noisy observations , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  Bodo Rosenhahn,et al.  Posebits for Monocular Human Pose Estimation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Sebastian Nowozin,et al.  A Non-parametric Bayesian Network Prior of Human Pose , 2013, 2013 IEEE International Conference on Computer Vision.

[17]  Vincent Lepetit,et al.  Predicting People's 3D Poses from Short Sequences , 2015, ArXiv.

[18]  Jitendra Malik,et al.  Recovering 3D human body configurations using shape contexts , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Jia Xu,et al.  Manifold-valued Dirichlet Processes , 2015, ICML.

[20]  Dilan Görür,et al.  Nonparametric Bayesian discrete latent variable models for unsupervised learning , 2007 .

[21]  Radford M. Neal,et al.  A Split-Merge Markov chain Monte Carlo Procedure for the Dirichlet Process Mixture Model , 2004 .

[22]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[23]  Bernt Schiele,et al.  Monocular 3D pose estimation and tracking by detection , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[24]  Xiaogang Wang,et al.  Multi-source Deep Learning for Human Pose Estimation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Jitendra Malik,et al.  Shape matching and object recognition using shape contexts , 2010, 2010 3rd International Conference on Computer Science and Information Technology.

[26]  Antoni B. Chan,et al.  3D Human Pose Estimation from Monocular Images with Deep Convolutional Neural Network , 2014, ACCV.

[27]  Radford M. Neal Markov Chain Sampling Methods for Dirichlet Process Mixture Models , 2000 .

[28]  Fernando De la Torre,et al.  Spatio-temporal Matching for Human Detection in Video , 2014, ECCV.

[29]  Erik B. Sudderth Graphical models for visual object recognition and tracking , 2006 .

[30]  Michael J. Black,et al.  Predicting 3D People from 2D Pictures , 2006, AMDO.

[31]  Chun Chen,et al.  A survey of human pose estimation: The body parts parsing based methods , 2015, J. Vis. Commun. Image Represent..

[32]  P. Thomas Fletcher,et al.  Principal geodesic analysis for the study of nonlinear statistics of shape , 2004, IEEE Transactions on Medical Imaging.

[33]  Jonathan Tompson,et al.  Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation , 2014, NIPS.

[34]  Juergen Gall,et al.  3D Pose Estimation from a Single Monocular Image , 2015, ArXiv.

[35]  Camillo J. Taylor,et al.  Reconstruction of articulated objects from point correspondences in a single uncalibrated image , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[36]  Xiaomin Duan,et al.  Riemannian Means on Special Euclidean Group and Unipotent Matrices Group , 2013, TheScientificWorldJournal.

[37]  Camilo Romero Núñez,et al.  Prevalence and risk factors associated with Toxocara canis infection in children. , 2013 .

[38]  Andrew W. Fitzgibbon,et al.  The Vitruvian manifold: Inferring dense correspondences for one-shot human pose estimation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[39]  Ankur Agarwal,et al.  Recovering 3D human pose from monocular images , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  Michael I. Jordan,et al.  Hierarchical Dirichlet Processes , 2006 .

[41]  Ryan P. Adams,et al.  ClusterCluster: Parallel Markov Chain Monte Carlo for Dirichlet Process Mixtures , 2013, ArXiv.

[42]  Cristian Sminchisescu,et al.  Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  Andrew Zisserman,et al.  Representing shape with a spatial pyramid kernel , 2007, CIVR '07.

[44]  Christian Szegedy,et al.  DeepPose: Human Pose Estimation via Deep Neural Networks , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[45]  H. Karcher Riemannian center of mass and mollifier smoothing , 1977 .

[46]  T. Ferguson BAYESIAN DENSITY ESTIMATION BY MIXTURES OF NORMAL DISTRIBUTIONS , 1983 .

[47]  W. Kendall Probability, Convexity, and Harmonic Maps with Small Image I: Uniqueness and Fine Existence , 1990 .

[48]  René Vidal,et al.  On the Convergence of Gradient Descent for Finding the Riemannian Center of Mass , 2011, SIAM J. Control. Optim..