Patch-Based Non-rigid 3D Reconstruction from a Single Depth Stream

We propose an approach for 3D reconstruction and tracking of dynamic surfaces using a single depth sensor, without any prior knowledge of the scene. The approach is robust to rapid inter-frame motion thanks to its probabilistic expectation-maximization non-rigid registration framework. Our pipeline subdivides each input depth image into non-rigidly connected surface patches and deforms the image towards the canonical pose by estimating a rigid transformation for each patch. The combination of a data term that imposes similarity between model and data and a regularizer that enforces as-rigid-as-possible motion of neighboring patches lets us handle large deformations while coping with sensor noise. We employ a surfel-based fusion technique, which avoids the repeated conversion between mesh and signed distance field representations that related techniques require. Furthermore, a robust keyframe-based scheme allows us to keep track of correspondences throughout the entire sequence. Through a variety of qualitative and quantitative experiments, we demonstrate that our approach tolerates larger motion and achieves lower reconstruction errors than related approaches.
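
To make the per-patch step concrete, the following is a minimal sketch of fitting one rigid transformation per surface patch. It is not the authors' implementation: it assumes point correspondences are already given, uses the standard weighted Kabsch (SVD) solution, and leaves out both the expectation-maximization soft assignment and the as-rigid-as-possible regularizer between neighboring patches. The names `fit_rigid_transform`, `fit_patch_transforms`, `labels`, and `weights` are hypothetical; `weights` merely stands in for the per-correspondence confidences that a probabilistic framework would supply.

```python
import numpy as np


def fit_rigid_transform(src, dst, weights=None):
    """Weighted least-squares rigid fit (Kabsch).

    Finds R (3x3 rotation) and t (3-vector) minimising
    sum_i w_i * ||R @ src[i] + t - dst[i]||^2.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if weights is None:
        weights = np.ones(len(src))
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()

    # Weighted centroids of both point sets.
    src_mean = w @ src
    dst_mean = w @ dst

    # Weighted 3x3 cross-covariance between the centred point sets.
    H = (src - src_mean).T @ (w[:, None] * (dst - dst_mean))
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t


def fit_patch_transforms(points, targets, labels, weights=None):
    """Fit one independent rigid transform per patch label (hypothetical helper).

    Assumption: every patch contains at least three non-collinear points.
    """
    points = np.asarray(points, dtype=float)
    targets = np.asarray(targets, dtype=float)
    labels = np.asarray(labels)
    transforms = {}
    for patch_id in np.unique(labels):
        mask = labels == patch_id
        w = None if weights is None else np.asarray(weights)[mask]
        transforms[int(patch_id)] = fit_rigid_transform(points[mask], targets[mask], w)
    return transforms
```

In a full registration loop this fit would be only the inner update: the weights would be re-estimated from the current alignment, and a smoothness term would couple the transforms of neighboring patches rather than solving each patch independently as done here.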
