Real-time identification and localization of body parts from depth images

We address the problem of detecting and identifying body parts in depth images at video frame rates. Our solution involves a novel interest point detector for mesh and range data that is particularly well suited to analyzing human shape. The interest points, which are based on identifying geodesic extrema on the surface mesh, coincide with salient points of the body and can be classified as, e.g., hand, foot, or head using local shape descriptors. Our approach also provides a natural way of estimating a 3D orientation vector for a given interest point. This vector can be used to normalize the local shape descriptors, which simplifies the classification problem, and to directly estimate the orientation of body parts in space. Experiments with ground-truth labels acquired via an active motion capture system show that our interest points, in conjunction with a boosted patch classifier, are significantly better at detecting body parts in depth images than state-of-the-art sliding-window-based detectors.
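
To make the notion of geodesic extrema concrete, the following is a minimal sketch of one common way to find them on a mesh graph. It is an illustration under stated assumptions, not the detector evaluated here. It assumes the surface is given as a list of 3D vertices plus undirected edges, that geodesic distance is approximated by shortest-path length along those edges (Dijkstra's algorithm), and that the search starts from a user-chosen source vertex such as one near the body center. The names `relax_from` and `geodesic_extrema` and the toy stick-figure data are hypothetical.

```python
import heapq
import math


def relax_from(adj, dist, source):
    """Lower the entries of dist in place using shortest paths from source."""
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))


def geodesic_extrema(points, edges, source, k):
    """Return up to k vertex indices that are successively farthest, along
    the mesh graph, from the source and from all previously found extrema."""
    # Adjacency list weighted by Euclidean edge length (an assumption).
    adj = [[] for _ in points]
    for a, b in edges:
        w = math.dist(points[a], points[b])
        adj[a].append((b, w))
        adj[b].append((a, w))

    dist = [math.inf] * len(points)
    relax_from(adj, dist, source)

    extrema = []
    for _ in range(k):
        reachable = [i for i, d in enumerate(dist) if d < math.inf]
        best = max(reachable, key=lambda i: dist[i])
        if dist[best] == 0.0:
            break  # every reachable vertex is already a source or an extremum
        extrema.append(best)
        # Treat the new extremum as an extra source, so the next pick is
        # far from it as well.
        relax_from(adj, dist, best)
    return extrema


# Toy stick figure: vertex 0 is the body center, with four limbs of
# different lengths ending at vertices 1, 3, 5, and 7.
points = [
    (0.0, 0.0, 0.0),                      # 0: center
    (0.0, 1.0, 0.0),                      # 1: short limb tip
    (-1.0, 0.0, 0.0), (-2.0, 0.0, 0.0),   # 2, 3: second limb
    (1.0, 0.0, 0.0), (2.5, 0.0, 0.0),     # 4, 5: third limb
    (0.0, -1.0, 0.0), (0.0, -3.0, 0.0),   # 6, 7: longest limb
]
edges = [(0, 1), (0, 2), (2, 3), (0, 4), (4, 5), (0, 6), (6, 7)]

print(geodesic_extrema(points, edges, source=0, k=4))  # [7, 5, 3, 1]
```

In this toy example the extrema are exactly the limb tips, ordered by their path length from the center. On real depth data the graph would have to be built from neighboring depth pixels or mesh faces, and each extremum would still need to be classified with a local shape descriptor, as described above.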
