Paper information - An adaptable system for RGB-D based human body detection and pose estimation

Highlights
- Does not require pre-processing by background subtraction or initialization poses.
- Online learned appearance model combining color with depth-based labeling.
- Works in clutter and with body part occlusions because of the underlying kinematic model.
- RDF training, data generation, and cluster-based learning that enable retraining.

Abstract
Human body detection and pose estimation is useful for a wide variety of applications and environments. Therefore, a human body detection and pose estimation system must be adaptable and customizable. This paper presents such a system, which extracts skeletons from RGB-D sensor data. The system adapts online to difficult unstructured scenes taken from a moving camera (since it does not require background subtraction) and benefits from using both color and depth data. It is customizable because it requires less training data, has a clearly described training method, and uses a customizable human kinematic model. Results show successful application to data from a moving camera in cluttered indoor environments. The system is open source, encouraging reuse, comparison, and future research.
