Integration of bottom-up/top-down approaches for 2D pose estimation using probabilistic Gaussian modelling

In this paper, we address the recovery of human 2D postures from monocular image sequences. We propose a novel pose estimation framework based on the integration of probabilistic bottom-up and top-down processes that iteratively refine each other: foreground pixels are segmented using image cues, whereas the fitting of a hierarchical 2D body model constrains the body partitions. The framework has two main advantages. First, it is activity-independent, since it does not rely on learning any motion model. Second, it provides a confidence score indicating the quality of each estimated pose. Our study also reveals a significant discrepancy between ground-truth joint positions depending on whether they are defined by humans or by a motion capture system. Quantitative and qualitative results on a variety of video sequences are presented to validate our approach.
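The alternation between a bottom-up and a top-down step can be illustrated with a minimal sketch. It is not the method of the paper. It assumes that each body part is modelled by a single 2D Gaussian over pixel coordinates, that foreground pixels are already available as an array of (x, y) positions, and that the confidence score is the mean log-likelihood of the pixels under the fitted parts. The names `gaussian_pdf`, `part_likelihoods` and `refine_parts` and the parameters `n_iter` and `reg` are hypothetical, and the hierarchical body model and the image cues are left out.

```python
import numpy as np


def gaussian_pdf(points, mean, cov):
    """Evaluate a 2D Gaussian density at each row of `points` (shape N x 2)."""
    diff = points - mean
    inv = np.linalg.inv(cov)
    mahal = np.einsum("ni,ij,nj->n", diff, inv, diff)
    norm = 2.0 * np.pi * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * mahal) / norm


def part_likelihoods(pixels, means, covs):
    """Likelihood of every pixel under every part, shape N x K."""
    return np.stack(
        [gaussian_pdf(pixels, m, c) for m, c in zip(means, covs)], axis=1
    )


def refine_parts(pixels, means, covs, n_iter=10, reg=1e-3):
    """Alternate soft pixel labelling and Gaussian refitting (assumed scheme)."""
    pixels = np.asarray(pixels, dtype=float)
    means = [np.asarray(m, dtype=float) for m in means]
    covs = [np.asarray(c, dtype=float) for c in covs]
    tiny = 1e-300  # guards against division by zero and log(0)

    for _ in range(n_iter):
        # Bottom-up step: soft-assign each foreground pixel to the body parts.
        lik = part_likelihoods(pixels, means, covs)
        resp = lik / (lik.sum(axis=1, keepdims=True) + tiny)

        # Top-down step: refit each part model from its weighted pixels.
        for k in range(len(means)):
            w = resp[:, k]
            w_sum = w.sum()
            if w_sum < 1e-9:
                continue  # part currently explains no pixels; keep it as is
            means[k] = (w[:, None] * pixels).sum(axis=0) / w_sum
            diff = pixels - means[k]
            outer = np.einsum("ni,nj->nij", diff, diff)
            covs[k] = (w[:, None, None] * outer).sum(axis=0) / w_sum
            covs[k] = covs[k] + reg * np.eye(2)  # keep covariance invertible

    # Final labelling and an assumed confidence score (mean log-likelihood).
    lik = part_likelihoods(pixels, means, covs)
    resp = lik / (lik.sum(axis=1, keepdims=True) + tiny)
    confidence = float(np.mean(np.log(lik.sum(axis=1) + tiny)))
    return means, covs, resp, confidence


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic "body parts": pixel clouds around (0, 0) and (10, 10).
    blob_a = rng.normal(loc=(0.0, 0.0), scale=1.0, size=(200, 2))
    blob_b = rng.normal(loc=(10.0, 10.0), scale=1.0, size=(200, 2))
    pixels = np.vstack([blob_a, blob_b])
    init_means = [(1.0, 1.0), (9.0, 9.0)]
    init_covs = [4.0 * np.eye(2), 4.0 * np.eye(2)]
    means, covs, resp, confidence = refine_parts(pixels, init_means, init_covs)
    print(means, confidence)
```

In this sketch a higher (less negative) confidence value means that the fitted parts explain the foreground pixels better; any threshold for accepting or rejecting a pose would have to be chosen for the data at hand.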
