Online optical marker-based hand tracking with deep labels

Optical marker-based motion capture is the dominant method for obtaining high-fidelity human body animation for special effects, movies, and video games. However, motion capture has seen limited application to the human hand because of the difficulty of automatically identifying (or labeling) identical markers on self-similar fingers. We propose a technique that frames the labeling problem as a keypoint regression problem amenable to a solution using convolutional neural networks. We demonstrate the robustness of our labeling solution to occlusion, ghost markers, variation in hand shape, and even motions involving two hands or handheld objects. Our technique is equally applicable to sparse and dense marker sets and can run in real time to support interaction prototyping with high-fidelity hand tracking and hand presence in virtual reality.
