BodyFusion: Real-Time Capture of Human Motion and Surface Geometry Using a Single Depth Camera

We propose BodyFusion, a novel real-time geometry fusion method that can track and reconstruct non-rigid surface motion of a human performance using a single consumer-grade depth camera. To reduce the ambiguities of the non-rigid deformation parameterization on the surface graph nodes, we take advantage of the internal articulated motion prior for human performance and contribute a skeleton-embedded surface fusion (SSF) method. The key feature of our method is that it jointly solves for both the skeleton and graph-node deformations based on information of the attachments between the skeleton and the graph nodes. The attachments are also updated frame by frame based on the fused surface geometry and the computed deformations. Overall, our method enables increasingly denoised, detailed, and complete surface reconstruction as well as the updating of the skeleton and attachments as the temporal depth frames are fused. Experimental results show that our method exhibits substantially improved nonrigid motion fusion performance and tracking robustness compared with previous state-of-the-art fusion methods. We also contribute a dataset for the quantitative evaluation of fusion-based dynamic scene reconstruction algorithms using a single depth camera.

[1]  Pushmeet Kohli,et al.  Fusion4D , 2016, ACM Trans. Graph..

[2]  Qionghai Dai,et al.  Performance Capture of Interacting Characters with Handheld Kinects , 2012, ECCV.

[3]  M. Pauly,et al.  Embedded deformation for shape manipulation , 2007, SIGGRAPH 2007.

[4]  Jinxiang Chai,et al.  Accurate realtime full-body motion capture using a single depth camera , 2012, ACM Trans. Graph..

[5]  Hans-Peter Seidel,et al.  Fast articulated motion tracking using a sums of Gaussians body model , 2011, 2011 International Conference on Computer Vision.

[6]  Jovan Popovic,et al.  Automatic rigging and animation of 3D characters , 2007, ACM Trans. Graph..

[7]  Joris De Schutter,et al.  An adaptable system for RGB-D based human body detection and pose estimation , 2014, J. Vis. Commun. Image Represent..

[8]  Leonidas J. Guibas,et al.  Robust single-view geometry and motion reconstruction , 2009, ACM Trans. Graph..

[9]  Andrew W. Fitzgibbon,et al.  3D scanning deformable objects with a single RGBD sensor , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Kun Zhou,et al.  3D shape regression for real-time facial animation , 2013, ACM Trans. Graph..

[11]  Hao Li,et al.  Global Correspondence Optimization for Non‐Rigid Registration of Depth Scans , 2008, Comput. Graph. Forum.

[12]  Michael J. Black,et al.  Coregistration: Simultaneous Alignment and Modeling of Articulated 3D Shape , 2012, ECCV.

[13]  Craig Gotsman,et al.  Articulated Object Reconstruction and Markerless Motion Capture from Depth Video , 2008, Comput. Graph. Forum.

[14]  Jonathan T. Barron,et al.  3D self-portraits , 2013, ACM Trans. Graph..

[15]  Hans-Peter Seidel,et al.  Markerless motion capture of interacting characters using multi-view image segmentation , 2011, CVPR 2011.

[16]  Yangang Wang,et al.  Online modeling for realtime facial animation , 2013, ACM Trans. Graph..

[17]  Xiaoyang Liu,et al.  Real-Time Geometry, Albedo, and Motion Reconstruction Using a Single RGB-D Camera , 2017, ACM Trans. Graph..

[18]  Matthias Zwicker,et al.  Global registration of dynamic range scans for articulated model reconstruction , 2011, TOGS.

[19]  Matthias Zwicker,et al.  Range Scan Registration Using Reduced Deformable Models , 2009, Comput. Graph. Forum.

[20]  Dieter Fox,et al.  DynamicFusion: Reconstruction and tracking of non-rigid scenes in real-time , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Christian Theobalt,et al.  On-set performance capture of multiple actors with a stereo camera , 2013, ACM Trans. Graph..

[22]  Hans-Peter Seidel,et al.  Motion capture using joint skeleton tracking and surface estimation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Charles T. Loop,et al.  Holoportation: Virtual 3D Teleportation in Real-time , 2016, UIST.

[24]  Andrew W. Fitzgibbon,et al.  Real-time non-rigid reconstruction using an RGB-D camera , 2014, ACM Trans. Graph..

[25]  Bodo Rosenhahn,et al.  Multisensor-fusion for 3D full-body human motion capture , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[26]  Wojciech Matusik,et al.  Articulated mesh animation from multi-view silhouettes , 2008, ACM Trans. Graph..

[27]  Huamin Wang,et al.  Modeling deformable objects from a single depth camera , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[28]  Richard M. Murray,et al.  A Mathematical Introduction to Robotic Manipulation , 1994 .

[29]  Qionghai Dai,et al.  Robust Non-rigid Motion Tracking and Surface Reconstruction Using L0 Regularization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[30]  Hans-Peter Seidel,et al.  Animation cartography—intrinsic reconstruction of shape and motion , 2012, TOGS.

[31]  William E. Lorensen,et al.  Marching cubes: A high resolution 3D surface construction algorithm , 1987, SIGGRAPH.

[32]  Michael J. Black,et al.  Detailed Full-Body Reconstructions of Moving People from Monocular RGB-D Sequences , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[33]  Günther Greiner,et al.  Reconstructing Animated Meshes from Time‐Varying Point Clouds , 2008, Comput. Graph. Forum.

[34]  Toby Sharp,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR.

[35]  Marc Levoy,et al.  A volumetric method for building complex models from range images , 1996, SIGGRAPH.

[36]  Marc Levoy,et al.  Efficient variants of the ICP algorithm , 2001, Proceedings Third International Conference on 3-D Digital Imaging and Modeling.

[37]  Ruigang Yang,et al.  Real-Time Simultaneous Pose and Shape Estimation for Articulated Objects Using a Single Depth Camera , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[38]  Jan Bender,et al.  Efficient GPU Data Structures and Methods to Solve Sparse Linear Systems in Dynamics Applications , 2013, Comput. Graph. Forum.

[39]  Chen Qian,et al.  Realtime and Robust Hand Tracking from Depth , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[40]  Antonis A. Argyros,et al.  Efficient model-based 3D tracking of hand articulations using Kinect , 2011, BMVC.

[41]  Hans-Peter Seidel,et al.  Efficient reconstruction of nonrigid shape and motion from real-time 3D scanner data , 2009, TOGS.

[42]  Daniel Cohen-Or,et al.  Space-time surface reconstruction using incompressible flow , 2008, ACM Trans. Graph..

[43]  Leonidas J. Guibas,et al.  Dynamic geometry registration , 2007, Symposium on Geometry Processing.

[44]  Olivier Dionne,et al.  Geodesic voxel binding for production character meshes , 2013, SCA '13.

[45]  Matthias Nießner,et al.  VolumeDeform: Real-Time Volumetric Non-rigid Reconstruction , 2016, ECCV.