Personalized Hand Modeling from Multiple Postures with Multi‐View Color Images

Personalized hand models can be used to synthesize high‐quality hand datasets, provide more training data for deep learning, and improve the accuracy of hand pose estimation. In recent years, parameterized hand models such as MANO have been widely used to obtain personalized hand models. However, because existing parameterized hand models have low resolution, it remains hard to obtain high‐fidelity personalized hand models. In this paper, we propose a new method for estimating personalized hand models from multiple hand postures captured in multi‐view color images. The personalized hand model is represented by a personalized neutral hand and multiple hand postures. We propose a novel optimization strategy to estimate the neutral hand from the multiple hand postures. To demonstrate the performance of our method, we built a multi‐view capture system and recorded more than 35 subjects, each performing 30 hand postures. We hope the estimated hand models will advance future research on high‐fidelity parameterized hand modeling. All the hand models are publicly available at www.yangangwang.com.
