Human body pose detection using Bayesian spatio-temporal templates

We present a template-based approach to detecting human silhouettes in a specific walking pose. Our templates consist of short sequences of 2D silhouettes obtained from motion capture data. This lets us incorporate motion information into the templates and helps distinguish actual people, who move in a predictable way, from static objects whose outlines roughly resemble those of humans. Moreover, during the training phase we use statistical learning techniques to estimate and store the relevance of the different silhouette parts to the recognition task. At run-time, we use this relevance information to convert Chamfer distances into meaningful probability estimates. The templates can handle six different camera views, excluding the frontal and back views, as well as different scales. We demonstrate the effectiveness of our technique on both indoor and outdoor sequences of people walking in front of cluttered backgrounds. The sequences were acquired with a moving camera, which makes techniques such as background subtraction impractical.
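
To make the matching step concrete, here is a minimal sketch of a relevance-weighted Chamfer distance between a silhouette template and image edge points, averaged over a short sequence of frames. It is an illustration under stated assumptions, not the implementation described above: the function names, the per-point `weights` standing in for learned part relevance, and the logistic mapping with its `midpoint` and `steepness` parameters are all hypothetical. The abstract does not specify how distances are converted to probabilities, so the logistic function is only a stand-in.

```python
import math


def chamfer_distance(template_pts, edge_pts, weights=None):
    """Weighted mean distance from each template point to its nearest edge point.

    `weights` is a hypothetical stand-in for the learned relevance of each
    silhouette part; uniform weights give the plain Chamfer distance.
    """
    if not template_pts or not edge_pts:
        raise ValueError("both point sets must be non-empty")
    if weights is None:
        weights = [1.0] * len(template_pts)
    total = 0.0
    for (tx, ty), w in zip(template_pts, weights):
        # Brute-force nearest neighbour; a distance transform would be
        # the usual choice for full images.
        nearest = min(math.hypot(tx - ex, ty - ey) for ex, ey in edge_pts)
        total += w * nearest
    return total / sum(weights)


def sequence_distance(template_frames, edge_frames, weights=None):
    """Average the per-frame Chamfer distance over a short template sequence."""
    if len(template_frames) != len(edge_frames):
        raise ValueError("template and image sequences must have equal length")
    per_frame = [
        chamfer_distance(t, e, weights)
        for t, e in zip(template_frames, edge_frames)
    ]
    return sum(per_frame) / len(per_frame)


def match_probability(distance, midpoint=2.0, steepness=1.5):
    """Illustrative logistic mapping from distance to a score in (0, 1).

    Assumption: the parameters are arbitrary here; in practice they would
    have to be estimated from training data.
    """
    return 1.0 / (1.0 + math.exp(steepness * (distance - midpoint)))
```

With uniform weights, a three-point vertical template at x = 0 matched against edge points at x = 1 gives a distance of 1.0. Raising the weight of a well-matched point lowers the overall distance, which is the effect that part relevance is meant to have.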
