A Robust Framework for 2D Human Pose Tracking with Spatial and Temporal Constraints

We address 2D articulated human pose tracking in monocular image sequences, an extremely challenging task due to background clutter, occlusion, and variation in body appearance and imaging conditions. Most current approaches model only simple appearance and dependencies between adjacent body parts, typically through Gaussian tree-structured priors over body part connections. Such a prior makes the part connections independent of image evidence, which in turn severely limits accuracy. Building on the successful pictorial structures model, we propose a novel framework with an image-conditioned model that incorporates higher-order dependencies among multiple body parts. To establish the conditioning variables, we employ poselet features. In addition, we introduce a full-body detector as the first step of our framework to reduce the search space for pose tracking. We evaluate our framework on two challenging image sequences and compare its performance with two other approaches. The results show that the proposed framework outperforms the state-of-the-art 2D pose tracking systems.
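
For orientation, the contrast drawn above between a Gaussian tree-structured prior and an image-conditioned model can be sketched in the usual pictorial structures notation. The symbols below (part configurations l_i, image evidence D, tree edges E, a poselet-based feature f(D)) and the simplified relative-position form of the pairwise term are assumptions chosen for illustration, not notation or equations taken from this paper.

```latex
% Standard pictorial structures posterior over a pose L = (l_1, ..., l_N)
% given image evidence D, with a tree-structured set of part pairs E:
\[
  p(L \mid D) \;\propto\; \prod_{i=1}^{N} p(d_i \mid l_i)
  \prod_{(i,j) \in E} p(l_i \mid l_j)
\]

% Gaussian pairwise prior (simplified): its parameters are fixed after
% training, so the term does not depend on the image D.
\[
  p(l_i \mid l_j) \;=\; \mathcal{N}\!\left(l_i - l_j \,;\; \mu_{ij},\, \Sigma_{ij}\right)
\]

% One possible image-conditioned variant: the pairwise parameters are
% selected as a function of a poselet-based feature f(D) (hypothetical symbol).
\[
  p(l_i \mid l_j, D) \;=\; \mathcal{N}\!\left(l_i - l_j \,;\; \mu_{ij}\big(f(D)\big),\, \Sigma_{ij}\big(f(D)\big)\right)
\]
```

In the second expression the pairwise term is the same for every image, which is the independence from image evidence that the abstract identifies as the limitation. In the third, the feature f(D) summarizes several parts at once, so the pairwise terms can reflect dependencies beyond adjacent parts while inference remains over a tree.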
