Shading-based refinement on volumetric signed distance functions

We present a novel method to obtain fine-scale detail in 3D reconstructions generated with low-budget RGB-D cameras or other commodity scanning devices. As the depth data of these sensors is noisy, truncated signed distance fields are typically used to regularize out the noise, which unfortunately leads to over-smoothed results. In our approach, we leverage RGB data to refine these reconstructions through shading cues, as color input is typically of much higher resolution than the depth data. As a result, we obtain reconstructions with high geometric detail, far beyond the depth resolution of the camera itself. Our core contribution is shading-based refinement directly on the implicit surface representation, which is generated from globally-aligned RGB-D images. We formulate the inverse shading problem on the volumetric distance field, and present a novel objective function which jointly optimizes for fine-scale surface geometry and spatially-varying surface reflectance. In order to enable the efficient reconstruction of sub-millimeter detail, we store and process our surface using a sparse voxel hashing scheme which we augment by introducing a grid hierarchy. A tailored GPU-based Gauss-Newton solver enables us to refine large shape models to previously unseen resolution within only a few seconds.
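The relationship between the implicit surface and shading can be illustrated with a minimal sketch. It is not the paper's objective function or solver: the abstract does not specify the lighting model, so a single directional light and Lambertian reflectance are assumed here, and all names (`sphere_sdf`, `sdf_normal`, `lambert_shading`, `shading_residual`) are hypothetical. The sketch shows only that a surface normal can be read off the gradient of a signed distance function, and that a per-point residual between predicted and observed intensity couples geometry (through the normal) with reflectance (through the albedo).

```python
import math

def sphere_sdf(p, radius=1.0):
    """Signed distance from point p to a sphere at the origin (negative inside)."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius

def sdf_normal(sdf, p, h=1e-4):
    """Unit normal as the normalized central-difference gradient of the SDF."""
    grad = []
    for axis in range(3):
        hi = list(p)
        lo = list(p)
        hi[axis] += h
        lo[axis] -= h
        grad.append((sdf(hi) - sdf(lo)) / (2.0 * h))
    length = math.sqrt(sum(c * c for c in grad))
    return [c / length for c in grad]

def lambert_shading(normal, light_dir, albedo):
    """Assumed shading model: albedo times the clamped cosine to a unit light direction."""
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    return albedo * max(0.0, n_dot_l)

def shading_residual(sdf, p, light_dir, albedo, observed):
    """Predicted minus observed intensity at p; zero when geometry and albedo explain the pixel."""
    return lambert_shading(sdf_normal(sdf, p), light_dir, albedo) - observed

# Usage: a point on the unit sphere facing the light, with albedo 0.5
# and an observed intensity of 0.5, gives a residual close to zero.
r = shading_residual(sphere_sdf, [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], 0.5, 0.5)
```

In a refinement setting, residuals of this kind would be summed over many surface points and minimized with respect to both the distance values and the albedo, typically together with regularization terms. The specific terms and weights used by the authors are not stated in the abstract.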
