Detail-preserved real-time hand motion regression from depth

This paper addresses the challenging problem of efficient and accurate hand tracking from depth sequences while simultaneously deforming a high-resolution 3D hand model with geometric details. We propose an integrated regression framework that infers articulated hand pose and regresses high-frequency details from sparse high-resolution 3D hand model examples. Specifically, the proposed method consists of four components: skeleton embedding, hand joint regression, skeleton alignment, and high-resolution detail integration. Skeleton embedding is optimized via a wrinkle-based skeleton refinement method to produce faithful hand models with fine geometric details. Hand joint regression is based on a deep convolutional network that predicts 3D hand joint locations from a single depth map; a skeleton alignment stage then recovers fully articulated hand poses. Deformable fine-scale details are estimated from a nonlinear mapping between the hand joints and per-vertex displacements. Experiments on two challenging datasets show that the proposed approach achieves accurate, robust, and real-time hand tracking while preserving most high-frequency details when deforming a virtual hand.
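To make the idea of a nonlinear mapping from hand joints to per-vertex displacements concrete, the following is a minimal sketch only. The abstract does not specify the form of the mapping, so this example swaps in a simple Gaussian-kernel weighted blend (Nadaraya-Watson style) over a few example poses. The function names, the `bandwidth` parameter, and the toy data layout are all assumptions made for illustration and are not taken from the paper.

```python
import math


def pose_distance_sq(a, b):
    """Squared Euclidean distance between two flattened joint vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def blend_displacements(query_joints, example_joints, example_displacements,
                        bandwidth=1.0):
    """Blend example per-vertex displacements by pose similarity.

    query_joints:          flattened joint coordinates of the current pose
    example_joints:        list of flattened joint vectors, one per example
    example_displacements: per example, a list of [dx, dy, dz] per vertex
    bandwidth:             Gaussian kernel width (hypothetical tuning knob)
    """
    # Gaussian kernel weight for each example pose (nonlinear in the joints).
    weights = [
        math.exp(-pose_distance_sq(query_joints, e) / (2.0 * bandwidth ** 2))
        for e in example_joints
    ]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to a uniform average.
        weights = [1.0] * len(weights)
        total = float(len(weights))

    n_vertices = len(example_displacements[0])
    blended = []
    for v in range(n_vertices):
        acc = [0.0, 0.0, 0.0]
        for w, disp in zip(weights, example_displacements):
            for k in range(3):
                acc[k] += w * disp[v][k]
        blended.append([c / total for c in acc])
    return blended


# Toy data: two example poses (one joint each) and a two-vertex mesh.
example_joints = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
example_displacements = [
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],  # relaxed pose: no wrinkle offsets
    [[0.0, 0.0, 2.0], [0.0, 2.0, 0.0]],  # bent pose: wrinkle offsets
]
halfway = blend_displacements([0.5, 0.0, 0.0], example_joints,
                              example_displacements)
```

A query pose midway between the two examples receives equal weights and therefore the average of their displacements; a smaller `bandwidth` makes the blend snap toward the nearest example pose.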
