Paper information - Capturing Hand Motion with an RGB-D Sensor, Fusing a Generative Model with Salient Points

Hand motion capture has been an active research topic, following the success of full-body pose tracking. Despite similarities, hand tracking proves to be more challenging, characterized by a higher dimensionality, severe occlusions and self-similarity between fingers. For this reason, most approaches rely on strong assumptions, like hands in isolation or expensive multi-camera systems, that limit practical use. In this work, we propose a framework for hand tracking that can capture the motion of two interacting hands using only a single, inexpensive RGB-D camera. Our approach combines a generative model with collision detection and discriminatively learned salient points. We quantitatively evaluate our approach on 14 new sequences with challenging interactions.

Dimitrios Tzionas | Juergen Gall | Abhilash Srikantha | Pablo Aponte
