Neural Inverse Rendering of an Indoor Scene From a Single Image

Inverse rendering aims to estimate physical attributes of a scene, e.g., reflectance, geometry, and lighting, from image(s). Inverse rendering has been studied primarily for single objects or with methods that solve for only one of the scene attributes. We propose the first learning based approach that jointly estimates albedo, normals, and lighting of an indoor scene from a single image. Our key contribution is the Residual Appearance Renderer (RAR), which can be trained to synthesize complex appearance effects (e.g., inter-reflection, cast shadows, near-field illumination, and realistic shading), which would be neglected otherwise. This enables us to perform self-supervised learning on real data using a reconstruction loss, based on re-synthesizing the input image from the estimated components. We finetune with real data after pretraining with synthetic data. To this end, we use physically-based rendering to create a large-scale synthetic dataset, named SUNCG-PBR, which is a significant improvement over prior datasets. Experimental results show that our approach outperforms state-of-the-art methods that estimate one or more scene attributes.

[1]  Tobias Ritschel,et al.  Joint Material and Illumination Estimation from Photo Sets in the Wild , 2017, 2018 International Conference on 3D Vision (3DV).

[2]  G. Stiny Shape , 1999 .

[3]  Alexei A. Efros,et al.  Learning Data-Driven Reflectance Priors for Intrinsic Image Decomposition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[4]  David A. Forsyth,et al.  Rendering synthetic objects into legacy photographs , 2011, ACM Trans. Graph..

[5]  Carlos D. Castillo,et al.  SfSNet: Learning Shape, Reflectance and Illuminance of Faces 'in the Wild' , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[6]  Eric P. Lafortune,et al.  Using the modified Phong reflectance model for physically based rendering , 1994 .

[7]  Ko Nishino,et al.  Shape and Reflectance from Natural Illumination , 2012, ECCV.

[8]  Kalyan Sunkavalli,et al.  Materials for Masses: SVBRDF Acquisition with a Single Mobile Phone Image , 2018, ECCV.

[9]  Kalyan Sunkavalli,et al.  Deep image-based relighting from optimal sparse samples , 2018, ACM Trans. Graph..

[10]  Jian Shi,et al.  Learning Non-Lambertian Object Intrinsics Across ShapeNet Categories , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Hans-Peter Seidel,et al.  LIME: Live Intrinsic Material Estimation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[12]  Ersin Yumer,et al.  Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Peter V. Gehler,et al.  Reflectance Adaptive Filtering Improves Intrinsic Image Estimation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Matthias Nießner,et al.  ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Jonathan T. Barron,et al.  Scene Intrinsics and Depth from a Single Image , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).

[16]  Yaser Yacoob,et al.  Label Denoising Adversarial Network (LDAN) for Inverse Lighting of Faces , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[17]  Takanori Maehara,et al.  Neural Inverse Rendering for General Reflectance Photometric Stereo , 2018, ICML.

[18]  Ersin Yumer,et al.  Learning to predict indoor illumination from a single image , 2017, ACM Trans. Graph..

[19]  Ye Yu,et al.  InverseRenderNet: Learning Single Image Inverse Rendering , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Ersin Yumer,et al.  Material Editing Using a Physically Based Rendering Network , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[21]  Ersin Yumer,et al.  Neural Face Editing with Intrinsic Image Disentangling , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[23]  Noah Snavely,et al.  Intrinsic images in the wild , 2014, ACM Trans. Graph..

[24]  Noah Snavely,et al.  OpenSurfaces , 2013, ACM Trans. Graph..

[25]  Andrew W. Fitzgibbon,et al.  Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Ko Nishino,et al.  Reflectance and Illumination Recovery in the Wild , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27]  Olivier D. Faugeras,et al.  Shape From Shading , 2006, Handbook of Mathematical Models in Computer Vision.

[28]  Luc Van Gool,et al.  DARN: a Deep Adversial Residual Network for Intrinsic Image Decomposition , 2016, ArXiv.

[29]  Shuang Zhao,et al.  Inverse Transport Networks , 2018, ArXiv.

[30]  Michael F. Cohen,et al.  Discovering Point Lights with Intensity Distance Fields , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[31]  Tatsuya Harada,et al.  Neural 3D Mesh Renderer , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[32]  Michael F. Cohen,et al.  Emptying, refurnishing, and relighting indoor spaces , 2016, ACM Trans. Graph..

[33]  Kalyan Sunkavalli,et al.  Learning to reconstruct shape and spatially-varying reflectance from a single image , 2018, ACM Trans. Graph..

[34]  Zhengqi Li,et al.  Learning Intrinsic Image Decomposition from Watching the World , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[35]  Jaakko Lehtinen,et al.  Differentiable Monte Carlo ray tracing through edge sampling , 2018, ACM Trans. Graph..

[36]  Luc Van Gool,et al.  DeLight-Net: Decomposing Reflectance Maps into Specular Materials and Natural Illumination , 2016, ArXiv.

[37]  Dacheng Tao,et al.  Deep Ordinal Regression Network for Monocular Depth Estimation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[38]  Timo Aila,et al.  Interactive reconstruction of Monte Carlo image sequences using a recurrent denoising autoencoder , 2017, ACM Trans. Graph..

[39]  Derek Hoiem,et al.  Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.

[40]  Matthias Nießner,et al.  A Lightweight Approach for On-the-Fly Reflectance Estimation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[41]  Zhengqi Li,et al.  CGIntrinsics: Better Intrinsic Image Decomposition through Physically-Based Rendering , 2018, ECCV.

[42]  Xuming He,et al.  Indoor scene structure analysis for single image depth estimation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Nikos Paragios,et al.  Handbook of Mathematical Models in Computer Vision , 2005 .

[44]  Stella X. Yu,et al.  Direct Intrinsics: Learning Albedo-Shading Decomposition by Convolutional Regression , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[45]  Yannick Hold-Geoffroy,et al.  Deep Outdoor Illumination Estimation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Jitendra Malik,et al.  Shape, Illumination, and Reflectance from Shading , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[47]  Thomas A. Funkhouser,et al.  Semantic Scene Completion from a Single Depth Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[48]  Erik Reinhard,et al.  Image-based material editing , 2005, SIGGRAPH '05.

[49]  Rob Fergus,et al.  Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[50]  Jean-François Lalonde,et al.  Learning to Estimate Indoor Lighting from 3D Objects , 2018, 2018 International Conference on 3D Vision (3DV).

[51]  Edward H. Adelson,et al.  Recovering intrinsic images from a single image , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.