Real-Time 3D Body Pose Estimation

This chapter presents a novel approach to markerless real-time 3D pose estimation in a multi-camera setup. We explain how foreground-background segmentation and 3D reconstruction are used to extract a 3D hull of the user. The hull is computed in real time using voxel carving with a fixed lookup table. The body pose is then retrieved with an example-based classifier whose 3D Haar-like wavelet features make real-time classification possible. Average neighborhood margin maximization (ANMM) is introduced as a powerful approach to training these Haar-like features.
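The voxel carving step with a fixed lookup table can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names (`build_lookup`, `carve`), the 4x4x4 grid, and the two toy orthographic "cameras" are assumptions chosen for brevity, whereas a real setup would use calibrated perspective projections and segmentation masks from the foreground-background stage. The idea shown is that the voxel-to-pixel mapping is computed once, so carving each frame reduces to table lookups.

```python
from itertools import product


def build_lookup(grid_size, project_fns):
    """Precompute, once, the pixel each voxel projects to in every camera."""
    voxels = list(product(range(grid_size), repeat=3))  # (x, y, z) indices
    table = [[project(v) for v in voxels] for project in project_fns]
    return voxels, table


def is_foreground(mask, pixel):
    """True if the pixel lies inside the mask and is marked as foreground."""
    row, col = pixel
    return 0 <= row < len(mask) and 0 <= col < len(mask[0]) and bool(mask[row][col])


def carve(voxels, table, masks):
    """Keep only voxels that are foreground in every camera's silhouette."""
    hull = set()
    for i, voxel in enumerate(voxels):
        if all(is_foreground(m, cam[i]) for m, cam in zip(masks, table)):
            hull.add(voxel)
    return hull


# Hypothetical toy cameras (orthographic, for illustration only):
# the "front" camera looks along z, the "side" camera looks along x.
front = lambda v: (v[1], v[0])  # (row, col) = (y, x)
side = lambda v: (v[1], v[2])   # (row, col) = (y, z)

# Both silhouettes are a 2x2 foreground square centred in a 4x4 image.
square = [[1 if r in (1, 2) and c in (1, 2) else 0 for c in range(4)]
          for r in range(4)]

voxels, table = build_lookup(4, [front, side])  # done once, offline
hull = carve(voxels, table, [square, square])   # done per frame
print(len(hull))
```

Because the table depends only on the camera geometry and the voxel grid, it stays valid for as long as the cameras do not move, which is what keeps the per-frame cost low.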
