Computationally Efficient Regression on a Dependency Graph for Human Pose Estimation

We present a hierarchical method for human pose estimation from a single still image. In our approach, a dependency graph representing relationships between reference points such as body joints is constructed and the positions of these reference points are sequentially estimated by a successive application of multidimensional output regressions along the dependency paths, starting from the root node. Each regressor takes image features computed from an image patch centered on the current node's position estimated by the previous regressor and is specialized for estimating its child nodes' positions. The use of the dependency graph allows us to decompose a complex pose estimation problem into a set of local pose estimation problems that are less complex. We design a dependency graph for two commonly used human pose estimation datasets, the Buffy Stickmen dataset and the ETHZ PASCAL Stickmen dataset, and demonstrate that our method achieves comparable accuracy to state-of-the-art results on both datasets with significantly lower computation time than existing methods. Furthermore, we propose an importance weighted boosted regression trees method for transductive learning settings and demonstrate the resulting improved performance for pose estimation tasks.

[1]  J. Friedman Greedy function approximation: A gradient boosting machine. , 2001 .

[2]  Bernt Schiele,et al.  Discriminative Appearance Models for Pictorial Structures , 2011, International Journal of Computer Vision.

[3]  Sugiyama Masashi,et al.  Relative Density-Ratio Estimation for Robust Distribution Comparison , 2011 .

[4]  Ben Taskar,et al.  Cascaded Models for Articulated Pose Estimation , 2010, ECCV.

[5]  Silvio Savarese,et al.  An efficient branch-and-bound algorithm for optimal human pose estimation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  R. Tibshirani,et al.  Least angle regression , 2004, math/0406456.

[7]  Martin A. Fischler,et al.  The Representation and Matching of Pictorial Structures , 1973, IEEE Transactions on Computers.

[8]  Daniel P. Huttenlocher,et al.  Efficient matching of pictorial structures , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[9]  Makoto Yamada,et al.  No Bias Left behind: Covariate Shift Adaptation for Discriminative 3D Pose Estimation , 2012, ECCV.

[10]  B. Turlach Discussion of "Least Angle Regression" by Efron, Hastie, Johnstone and Tibshirani , 2004 .

[11]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[12]  Stefano Soatto,et al.  Fast Human Pose Estimation using Appearance and Motion via Multi-Dimensional Boosting Regression , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Bernt Schiele,et al.  Articulated people detection and pose estimation: Reshaping the future , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Jitendra Malik,et al.  Recovering human body configurations using pairwise constraints between parts , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[15]  Yi Yang,et al.  Articulated pose estimation with flexible mixtures-of-parts , 2011, CVPR 2011.

[16]  Kota Hara,et al.  Human pose estimation using patch-based candidate generation and model-based verification , 2011, Face and Gesture 2011.

[17]  Mun Wai Lee,et al.  Proposal maps driven MCMC for estimating human body pose in static images , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[18]  S. D. Jong SIMPLS: an alternative approach to partial least squares regression , 1993 .

[19]  Ankur Agarwal,et al.  A Local Basis Representation for Estimating Human Pose from Cluttered Images , 2006, ACCV.

[20]  Bernt Schiele,et al.  Robust Object Detection with Interleaved Categorization and Segmentation , 2008, International Journal of Computer Vision.

[21]  Roman Rosipal,et al.  Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space , 2002, J. Mach. Learn. Res..

[22]  Andrew Zisserman,et al.  2D Articulated Human Pose Estimation and Retrieval in (Almost) Unconstrained Still Images , 2012, International Journal of Computer Vision.

[23]  Michael Arens,et al.  Human pose estimation with implicit shape models , 2010, ARTEMIS '10.

[24]  Norimichi Ukita Articulated pose estimation with parts connectivity using discriminative local oriented contours , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  B. Minasny The Elements of Statistical Learning, Second Edition, Trevor Hastie, Robert Tishirani, Jerome Friedman. (2009), Springer Series in Statistics, ISBN 0172-7397, 745 pp , 2009 .

[26]  Takafumi Kanamori,et al.  Relative Density-Ratio Estimation for Robust Distribution Comparison , 2011, Neural Computation.

[27]  Klaus Nordhausen,et al.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition by Trevor Hastie, Robert Tibshirani, Jerome Friedman , 2009 .

[28]  Björn Stenger,et al.  Multivariate Relevance Vector Machines for Tracking , 2006, ECCV.

[29]  Michael J. Black,et al.  From Pictorial Structures to deformable structures , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Trevor Hastie,et al.  The Elements of Statistical Learning , 2001 .

[31]  D. H. D. West Updating mean and variance estimates: an improved method , 1979, CACM.