Learning to reconstruct shape and spatially-varying reflectance from a single image

Reconstructing shape and reflectance properties from images is a highly under-constrained problem, and has previously been addressed by using specialized hardware to capture calibrated data or by assuming known (or highly constrained) shape or reflectance. In contrast, we demonstrate that we can recover non-Lambertian, spatially-varying BRDFs and complex geometry belonging to arbitrary shape classes from a single RGB image captured under a combination of unknown environment illumination and flash lighting. We achieve this by training a deep neural network to regress shape and reflectance from the image. Our network is able to address this problem because of three novel contributions. First, we build a large-scale dataset of procedurally generated shapes and real-world complex SVBRDFs that approximates real-world appearance well. Second, single-image inverse rendering requires reasoning at multiple scales, and we propose a cascade network structure that allows this in a tractable manner. Finally, we incorporate an in-network rendering layer that aids the reconstruction task by handling global illumination effects that are important for real-world scenes. Together, these contributions allow us to tackle the entire inverse rendering problem in a holistic manner and produce state-of-the-art results on both synthetic and real data.
