Learning Signed Distance Field for Multi-view Surface Reconstruction

Recent works on implicit neural representations have shown promising results for multi-view surface reconstruction. However, most approaches are limited to relatively simple geometries and usually require clean object masks to reconstruct complex and concave objects. In this work, we introduce a novel neural surface reconstruction framework that leverages the knowledge of stereo matching and feature consistency to optimize the implicit surface representation. More specifically, we use a signed distance field (SDF) and a surface light field to represent the scene geometry and appearance, respectively. The SDF is directly supervised by geometry from stereo matching, and is refined by optimizing the multi-view feature consistency and the fidelity of rendered images. Our method improves the robustness of geometry estimation and supports reconstruction of complex scene topologies. Extensive experiments have been conducted on the DTU, EPFL, and Tanks and Temples datasets. Compared to previous state-of-the-art methods, our method achieves better mesh reconstruction in wide open scenes without masks as input.
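
As a rough illustration of what direct supervision of an SDF by stereo geometry can mean, the sketch below penalizes the absolute SDF value at points assumed to lie on the surface. It is not the paper's implementation: the SDF here is an analytic sphere rather than a learned network, the "stereo points" are synthetic samples on a unit sphere, and the names `sphere_sdf` and `surface_point_loss` are hypothetical.

```python
import numpy as np

def sphere_sdf(points, center, radius):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside."""
    return np.linalg.norm(points - center, axis=-1) - radius

def surface_point_loss(sdf_values):
    """Mean absolute SDF at points that are expected to lie on the surface."""
    return float(np.mean(np.abs(sdf_values)))

# Assumption: synthetic stand-in for points recovered by stereo matching,
# sampled uniformly on a unit sphere centered at the origin.
rng = np.random.default_rng(0)
directions = rng.normal(size=(1000, 3))
stereo_points = directions / np.linalg.norm(directions, axis=1, keepdims=True)

center = np.zeros(3)

# An SDF whose zero level set passes through the points gives a loss near zero.
loss_good = surface_point_loss(sphere_sdf(stereo_points, center, 1.0))

# An SDF with the wrong radius leaves every point 0.2 away from its surface.
loss_bad = surface_point_loss(sphere_sdf(stereo_points, center, 0.8))

print(f"matching surface:   {loss_good:.6f}")
print(f"mismatched surface: {loss_bad:.6f}")
```

In a learned setting, a term of this kind would be one component of the objective; the abstract states that the SDF is further refined with multi-view feature consistency and rendering fidelity, which this sketch does not model.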
