Real-time finger tracking using active motion capture: a neural network approach robust to occlusions

Hands deserve particular attention in virtual reality (VR) applications because they are our primary means of interacting with the environment. Although marker-based motion capture with inverse kinematics works adequately for full-body tracking, it is less reliable for small body parts such as hands and fingers, which are often occluded when captured optically; this leads VR professionals to rely on additional systems (e.g. inertial trackers). We present a machine learning pipeline to track hands and fingers using only a motion capture system based on cameras and active markers. Our finger animation is driven by a predictive model based on neural networks, trained on a movement dataset acquired from several subjects with a complementary capture system. We employ a two-stage pipeline, which first resolves occlusions and then recovers all joint transformations. We show that our method compares favorably to inverse kinematics by automatically inferring constraints from the data, provides a natural reconstruction of postures, and handles occlusions better than three proposed baselines.
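The two-stage pipeline described above can be sketched as two small feed-forward networks chained together: the first fills in occluded marker positions, the second regresses joint rotations from the completed marker set. This is a minimal, untrained toy illustration, not the paper's implementation; the marker and joint counts, network sizes, and the zero-encoding of missing markers are all assumptions made for the sketch.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(W, b, v):
    # W: out-by-in weight matrix, b: bias vector
    return [sum(w * x for w, x in zip(row, v)) + bj
            for row, bj in zip(W, b)]

class MLP:
    """Minimal fully connected network with ReLU hidden layers."""
    def __init__(self, sizes, seed=0):
        rng = random.Random(seed)
        self.layers = []
        for n_in, n_out in zip(sizes, sizes[1:]):
            W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                 for _ in range(n_out)]
            self.layers.append((W, [0.0] * n_out))

    def forward(self, v):
        for i, (W, b) in enumerate(self.layers):
            v = linear(W, b, v)
            if i < len(self.layers) - 1:  # no activation on output layer
                v = relu(v)
        return v

N_MARKERS = 8   # hypothetical active-marker count per hand
N_JOINTS = 15   # hypothetical finger joint count

# Stage 1: recover occluded 3D marker positions (missing markers zeroed out).
occlusion_net = MLP([N_MARKERS * 3, 64, N_MARKERS * 3])
# Stage 2: map the completed marker set to per-joint rotation parameters
# (e.g. 3 values per joint with an exponential-map parameterization).
pose_net = MLP([N_MARKERS * 3, 64, N_JOINTS * 3])

def track_frame(markers):
    completed = occlusion_net.forward(markers)
    return pose_net.forward(completed)

# Toy input: one frame of flattened marker coordinates.
frame = [0.1] * (N_MARKERS * 3)
rotations = track_frame(frame)
assert len(rotations) == N_JOINTS * 3
```

Splitting the problem this way means the pose regressor always sees a complete marker set, so occlusion handling and kinematic reconstruction can be trained and evaluated separately.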
