Holistic Human Pose Estimation with Regression Forests

In this work, we address the problem of human pose estimation in still images by proposing a holistic model for learning the appearance of the human body from image patches. These patches, which are randomly chosen, are used for extracting features and training a regression forest. During training, a mapping between image features and human poses, defined by joint offsets, is learned; while during prediction, the body joints are estimated with an efficient mode-seeking algorithm. In comparison to other holistic approaches, we can recover body poses from occlusion or noisy data. We demonstrate the power of our method in two publicly available datasets and propose a third one. Finally, we achieve state-of-the-art results in comparison to other approaches.

[1]  Jitendra Malik,et al.  Estimating Human Body Configurations Using Shape Context Matching , 2002, ECCV.

[2]  Dorin Comaniciu,et al.  Mean Shift: A Robust Approach Toward Feature Space Analysis , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  Luc Van Gool,et al.  Human Pose Estimation Using Body Parts Dependent Joint Regressors , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Mark Everingham,et al.  Learning effective human pose estimation from inaccurate annotation , 2011, CVPR 2011.

[5]  Trevor Darrell,et al.  Sparse probabilistic regression for activity-independent human pose inference , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Mandy Eberhart,et al.  Decision Forests For Computer Vision And Medical Image Analysis , 2016 .

[7]  Philip H. S. Torr,et al.  Randomized trees for human pose detection , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Dariu Gavrila,et al.  A Bayesian, Exemplar-Based Approach to Hierarchical Shape Matching , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Trevor Darrell,et al.  Fast pose estimation with parameter-sensitive hashing , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[10]  Yang Wang,et al.  Learning hierarchical poselets for human parsing , 2011, CVPR 2011.

[11]  Martin A. Fischler,et al.  The Representation and Matching of Pictorial Structures , 1973, IEEE Transactions on Computers.

[12]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[13]  Yi Yang,et al.  Articulated pose estimation with flexible mixtures-of-parts , 2011, CVPR 2011.

[14]  Vittorio Ferrari,et al.  Better Appearance Models for Pictorial Structures , 2009, BMVC.

[15]  Michael J. Black,et al.  From Pictorial Structures to deformable structures , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Thomas Deselaers,et al.  ClassCut for Unsupervised Class Segmentation , 2010, ECCV.

[17]  Andrew W. Fitzgibbon,et al.  Efficient regression of general-activity human poses from depth images , 2011, 2011 International Conference on Computer Vision.

[18]  Andrew W. Fitzgibbon,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR 2011.

[19]  Bernt Schiele,et al.  Discriminative Appearance Models for Pictorial Structures , 2011, International Journal of Computer Vision.

[20]  Ben Taskar,et al.  Cascaded Models for Articulated Pose Estimation , 2010, ECCV.

[21]  Mads Nielsen,et al.  Computer Vision — ECCV 2002 , 2002, Lecture Notes in Computer Science.

[22]  Bernt Schiele,et al.  Articulated people detection and pose estimation: Reshaping the future , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Adrian Hilton,et al.  Visual Analysis of Humans - Looking at People , 2013 .

[24]  Cristian Sminchisescu,et al.  Latent structured models for human pose estimation , 2011, 2011 International Conference on Computer Vision.

[25]  Peter V. Gehler,et al.  Poselet Conditioned Pictorial Structures , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Nassir Navab,et al.  3D Pictorial Structures for Multiple Human Pose Estimation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Ankur Agarwal,et al.  Recovering 3D human pose from monocular images , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Jitendra Malik,et al.  Poselets: Body part detectors trained using 3D human pose annotations , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[30]  Daniel P. Huttenlocher,et al.  Pictorial Structures for Object Recognition , 2004, International Journal of Computer Vision.

[31]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[32]  Bernt Schiele,et al.  Pictorial structures revisited: People detection and articulated pose estimation , 2009, CVPR.

[33]  Christoph Schnörr,et al.  A Study of Parts-Based Object Class Detection Using Complete Graphs , 2010, International Journal of Computer Vision.

[34]  Silvio Savarese,et al.  Articulated part-based model for joint object detection and pose estimation , 2011, 2011 International Conference on Computer Vision.

[35]  Hossein Azizpour,et al.  Multi-view Body Part Recognition with Random Forests , 2013, BMVC.