Fast Human Pose Estimation using Appearance and Motion via Multi-Dimensional Boosting Regression

We address the problem of estimating human pose in video sequences, where rough location has been determined. We exploit both appearance and motion information by defining suitable features of an image and its temporal neighbors, and learning a regression map to the parameters of a model of the human body using boosting techniques. Our algorithm can be viewed as a fast initialization step for human body trackers, or as a tracker itself. We extend gradient boosting techniques to learn a multi-dimensional map from (rotated and scaled) Haar features to the entire set of joint angles representing the full body pose. We test our approach by learning a map from image patches to body joint angles from synchronized video and motion capture walking data. We show how our technique enables learning an efficient real-time pose estimator, validated on publicly available datasets.

[1]  Alex Pentland,et al.  Pfinder: Real-Time Tracking of the Human Body , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[2]  Michael J. Black,et al.  Parameterized modeling and recognition of activities in temporal surfaces , 1999 .

[3]  Dariu Gavrila,et al.  The Visual Analysis of Human Movement: A Survey , 1999, Comput. Vis. Image Underst..

[4]  Michael J. Black,et al.  Parameterized Modeling and Recognition of Activities , 1999, Comput. Vis. Image Underst..

[5]  Daniel P. Huttenlocher,et al.  Efficient matching of pictorial structures , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[6]  Rómer Rosales,et al.  Learning Body Pose via Specialized Maps , 2001, NIPS.

[7]  J. Friedman Greedy function approximation: A gradient boosting machine. , 2001 .

[8]  Michael J. Black,et al.  Automatic Detection and Tracking of Human Motion with a View-Based Representation , 2002, ECCV.

[9]  J. Friedman Stochastic gradient boosting , 2002 .

[10]  Cordelia Schmid,et al.  Learning to Parse Pictures of People , 2002, ECCV.

[11]  Michael Isard,et al.  Attractive People: Assembling Loose-Limbed Models using Non-parametric Belief Propagation , 2003, NIPS.

[12]  Trevor Darrell,et al.  Fast pose estimation with parameter-sensitive hashing , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[13]  Paul A. Viola,et al.  Detecting Pedestrians Using Patterns of Motion and Appearance , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[14]  Trevor Darrell,et al.  Inferring 3D structure with a statistical image-based shape model , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[15]  Mun Wai Lee,et al.  Proposal maps driven MCMC for estimating human body pose in static images , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[16]  David A. Forsyth,et al.  Probabilistic Methods for Finding People , 2001, International Journal of Computer Vision.

[17]  Ankur Agarwal,et al.  3D human pose from silhouettes by relevance vector regression , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[18]  A. Elgammal,et al.  Inferring 3D body pose from silhouettes using activity manifold learning , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[19]  A. Torralba,et al.  Sharing features: efficient boosting procedures for multiclass object detection , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[20]  Michael Isard,et al.  CONDENSATION—Conditional Density Propagation for Visual Tracking , 1998, International Journal of Computer Vision.

[21]  Alexei A. Efros,et al.  Recovering human body configurations: combining segmentation and recognition , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[22]  M. Lee,et al.  Proposal maps driven MCMC for estimating human body pose in static images , 2004, CVPR 2004.

[23]  David A. Forsyth,et al.  Strike a pose: tracking people by finding stylized poses , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[24]  Paul A. Viola,et al.  Detecting Pedestrians Using Patterns of Motion and Appearance , 2005, International Journal of Computer Vision.

[25]  Gang Hua,et al.  Learning to estimate human pose with data driven belief propagation , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[26]  Dorin Comaniciu,et al.  Image based regression using boosting method , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[27]  Ankur Agarwal,et al.  Monocular Human Motion Capture with a Mixture of Regressors , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Workshops.

[28]  Michael J. Black,et al.  HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion , 2006 .

[29]  Björn Stenger,et al.  Multivariate Relevance Vector Machines for Tracking , 2006, ECCV.