One Ring to Rule Them All: a simple solution to multi-view 3D-Reconstruction of shapes with unknown BRDF via a small Recurrent ResNet

This paper proposes a simple method that solves an open problem in multi-view 3D reconstruction: recovering objects with unknown, generic surface materials from images taken by a freely moving camera under a freely moving point light source. The object may have arbitrary (e.g. non-Lambertian) and spatially-varying surface reflectance (svBRDF), that is, a reflectance that can differ at every surface point. Our solution consists of two small neural networks (dubbed the ‘Shape-Net’ and the ‘BRDF-Net’), each with about 1,000 neurons, which parameterize the unknown shape and the unknown svBRDF, respectively. Key to our method is a special network design (namely, a ResNet with a global feedback or ‘ring’ connection), which comes with a provable guarantee of finding a valid diffeomorphic shape parameterization. Although the underlying problem is highly non-convex and hence impractical to solve with traditional optimization techniques, our method converges reliably to high-quality solutions, even without initialization. Extensive experiments demonstrate the superiority of our method, which also naturally enables a wide range of special-effect applications, including novel-view synthesis, relighting, material retouching, and shape exchange, without additional coding effort. We encourage the reader to view our demo video for better visualizations.
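To make the ‘ring’ idea concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes that the global feedback connection means the output of a small residual stack is fed back into its own input for a few loops, and that the shape is obtained by deforming points sampled on a unit sphere. The function names, layer sizes, step size, and loop count are all hypothetical. The sketch also evaluates a standard sufficient condition for each residual block to be invertible (the residual branch has a Lipschitz constant below 1), which is one common route to diffeomorphic maps; whether the paper's guarantee is derived this way is not stated in the abstract.

```python
import numpy as np


def init_params(rng, dim=3, hidden=16, blocks=4, scale=0.1):
    """Random weights for `blocks` residual blocks (sizes are illustrative)."""
    return [
        (
            scale * rng.standard_normal((dim, hidden)),  # input -> hidden
            np.zeros(hidden),                            # hidden bias
            scale * rng.standard_normal((hidden, dim)),  # hidden -> output
        )
        for _ in range(blocks)
    ]


def resnet_pass(params, x, step=0.5):
    """One pass through the residual stack: x <- x + step * f(x) per block."""
    for w1, b1, w2 in params:
        x = x + step * np.tanh(x @ w1 + b1) @ w2
    return x


def ring_resnet(params, x, loops=3, step=0.5):
    """Assumed 'ring': feed the stack's output back into its input."""
    for _ in range(loops):
        x = resnet_pass(params, x, step)
    return x


def residual_lipschitz_bound(params, step=0.5):
    """Upper bound on the Lipschitz constant of any residual branch.

    tanh is 1-Lipschitz, so a branch is bounded by step * ||w1|| * ||w2||
    (spectral norms). A bound below 1 makes x -> x + step * f(x) invertible.
    """
    return max(
        step * np.linalg.norm(w1, 2) * np.linalg.norm(w2, 2)
        for w1, _, w2 in params
    )


def sample_sphere(rng, n):
    """Draw n points uniformly on the unit sphere (the template shape)."""
    p = rng.standard_normal((n, 3))
    return p / np.linalg.norm(p, axis=1, keepdims=True)


rng = np.random.default_rng(0)
params = init_params(rng)
template = sample_sphere(rng, 200)
surface = ring_resnet(params, template)   # deformed template points
bound = residual_lipschitz_bound(params)  # invertibility check
```

In an actual reconstruction pipeline the weights would be optimized against a photometric loss rather than drawn at random; the sketch only illustrates the forward map from template points to surface points.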
