Real-time neural network prediction for handling two-hands mutual occlusions

Abstract: Hands deserve particular attention in virtual reality (VR) applications because they are our primary means of interacting with the environment. Although marker-based motion capture works adequately for full-body tracking, it is less reliable for small body parts such as hands and fingers, which are often occluded when captured optically. This leads VR professionals to rely on additional systems (e.g. inertial trackers). We present a machine learning pipeline that tracks hands and fingers using solely a motion capture system based on cameras and active markers. Finger animation is performed by a predictive model based on neural networks, trained on a dataset of movements acquired from several subjects with a complementary capture system. We employ a two-stage pipeline that first resolves occlusions and then recovers all joint transformations. We show that our method compares favorably to inverse kinematics by automatically inferring the constraints from the data, provides a natural reconstruction of postures, and handles occlusions better than three proposed baselines.
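The two-stage idea described above (first resolve occlusions, then recover joint transformations) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the marker count, the number of joint parameters, the layer sizes, the use of a visibility mask as extra input, and the helper names `mlp`, `init_weights`, and `track_frame` are all hypothetical, and the weights are random rather than trained.

```python
import numpy as np

N_MARKERS = 8        # hypothetical number of markers per hand
N_JOINT_PARAMS = 45  # hypothetical size of the joint-rotation output

def mlp(x, weights):
    """Feed-forward pass: ReLU on hidden layers, linear output layer."""
    for i, (W, b) in enumerate(weights):
        x = x @ W + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

def init_weights(sizes, rng):
    """Random (untrained) weights, used only to make the sketch runnable."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def track_frame(markers, visible, stage1, stage2):
    """Run one frame through the assumed two-stage pipeline.

    markers: (N_MARKERS, 3) positions; occluded rows may hold any value.
    visible: (N_MARKERS,) boolean mask, False where a marker is occluded.
    """
    # Stage 1: predict all marker positions from the masked input,
    # then keep measured positions wherever a marker is visible.
    masked = np.where(visible[:, None], markers, 0.0)
    x = np.concatenate([masked.ravel(), visible.astype(float)])
    predicted = mlp(x, stage1).reshape(markers.shape)
    completed = np.where(visible[:, None], markers, predicted)

    # Stage 2: map the completed marker set to joint parameters.
    joints = mlp(completed.ravel(), stage2)
    return completed, joints

rng = np.random.default_rng(0)
stage1 = init_weights([N_MARKERS * 4, 64, N_MARKERS * 3], rng)
stage2 = init_weights([N_MARKERS * 3, 64, N_JOINT_PARAMS], rng)

markers = rng.standard_normal((N_MARKERS, 3))
visible = np.ones(N_MARKERS, dtype=bool)
visible[[2, 5]] = False  # simulate two occluded markers
completed, joints = track_frame(markers, visible, stage1, stage2)
```

In this sketch the first network only fills in the occluded markers, so measured positions pass through unchanged; the second network sees a complete marker set regardless of how many markers were hidden. A trained system would replace the random weights with parameters learned from recorded motion data.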