A survey of human pose estimation: The body parts parsing based methods

This survey summarizes recent methods for human pose estimation, draws conclusions about the traditional methods, organizes the discussion around a two-stage framework, and gives comprehensive comparisons based on open-source methods.

Estimating human pose from videos and image sequences is not only an important computer vision problem but also plays a critical role in many real-world applications. The main challenges for human pose estimation are variations in body pose, complicated backgrounds, and depth ambiguities. Considerable research effort has been devoted to solving these problems. In this survey, we focus on recent advances in vision-based human pose estimation. We first present a general framework for human pose estimation and then go through the latest technical progress at each stage. Finally, we discuss the limitations of existing approaches and outline future directions to be explored.
