Localized fusion of Shape and Appearance features for 3D Human Pose Estimation

This paper presents a learning-based method for combining the shape and appearance feature types for 3D human pose estimation from single-view images. Our method is based on clustering the 3D pose space into several modular regions and learning the regressors for both feature types and their optimal fusion scenario in each region. This way the complementary information of the individual feature types is exploited, leading to improved performance of pose estimation. We train and evaluate our method using a synchronized video and 3D motion dataset. Our experimental results show that the proposed feature combination method gave more accurate pose estimation than that from each individual feature type.

[1]  Stefano Soatto,et al.  Relevant Feature Selection for Human Pose Estimation and Localization in Cluttered Images , 2008, ECCV.

[2]  Ronald Poppe,et al.  Evaluating Example-based Pose Estimation: Experiments on the HumanEva Sets , 2007 .

[3]  Ankur Agarwal,et al.  Monocular Human Motion Capture with a Mixture of Regressors , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Workshops.

[4]  Michael E. Tipping The Relevance Vector Machine , 1999, NIPS.

[5]  AgarwalAnkur,et al.  Recovering 3D Human Pose from Monocular Images , 2006 .

[6]  Ankur Agarwal,et al.  Recovering 3D human pose from monocular images , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Mun Wai Lee,et al.  A model-based approach for estimating human 3D poses in static images , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  I. Haritaoglu,et al.  Background and foreground modeling using nonparametric kernel density estimation for visual surveillance , 2002 .

[9]  Michael J. Black,et al.  HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion , 2006 .

[10]  Cristian Sminchisescu,et al.  Discriminative density propagation for 3D human motion estimation , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[11]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[12]  Ian D. Reid,et al.  Articulated Body Motion Capture by Stochastic Search , 2005, International Journal of Computer Vision.

[13]  Mohammed Bennamoun,et al.  Context-Based Appearance Descriptor for 3D Human Pose Estimation from Monocular Images , 2009, 2009 Digital Image Computing: Techniques and Applications.

[14]  Michael E. Tipping,et al.  Fast Marginal Likelihood Maximisation for Sparse Bayesian Models , 2003 .