Real-Time Upper Body Detection and 3D Pose Estimation in Monoscopic Images

This paper presents a novel solution to the difficult task of both detecting and estimating the 3D pose of humans in monoscopic images. The approach consists of two parts. Firstly the location of a human is identified by a probabalistic assembly of detected body parts. Detectors for the face, torso and hands are learnt using adaBoost. A pose likliehood is then obtained using an a priori mixture model on body configuration and possible configurations assembled from available evidence using RANSAC. Once a human has been detected, the location is used to initialise a matching algorithm which matches the silhouette and edge map of a subject with a 3D model. This is done efficiently using chamfer matching, integral images and pose estimation from the initial detection stage. We demonstrate the application of the approach to large, cluttered natural images and at near framerate operation (16fps) on lower resolution video streams.

[1]  Cordelia Schmid,et al.  Learning to Parse Pictures of People , 2002, ECCV.

[2]  William T. Freeman,et al.  Bayesian Reconstruction of 3D Human Motion from Single-Camera Video , 1999, NIPS.

[3]  Cordelia Schmid,et al.  Human Detection Based on a Probabilistic Assembly of Robust Part Detectors , 2004, ECCV.

[4]  Stephen J. McKenna,et al.  Human Pose Estimation Using Learnt Probabilistic Region Similarities and Partial Configurations , 2004, ECCV.

[5]  Daniel P. Huttenlocher,et al.  Efficient matching of pictorial structures , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[6]  Michael Isard,et al.  Attractive People: Assembling Loose-Limbed Models using Non-parametric Belief Propagation , 2003, NIPS.

[7]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.

[8]  Richard Bowden,et al.  View-based Location and Tracking of Body Parts for Visual Interaction , 2004, BMVC.

[9]  Tomaso A. Poggio,et al.  Example-Based Object Detection in Images by Components , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[10]  Robert C. Bolles,et al.  Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography , 1981, CACM.

[11]  Takeo Kanade,et al.  Neural Network-Based Face Detection , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[12]  Björn Stenger,et al.  Hand Pose Estimation Using Hierarchical Detection , 2004, ECCV Workshop on HCI.

[13]  David A. Forsyth,et al.  Probabilistic Methods for Finding People , 2001, International Journal of Computer Vision.

[14]  Paul A. Viola,et al.  Robust Real-time Object Detection , 2001 .

[15]  Robert C. Bolles,et al.  Parametric Correspondence and Chamfer Matching: Two New Techniques for Image Matching , 1977, IJCAI.