Neural Light Transport for Relighting and View Synthesis

The light transport (LT) of a scene describes how it appears under different lighting conditions from different viewing directions, and complete knowledge of a scene’s LT enables the synthesis of novel views under arbitrary lighting. In this article, we focus on image-based LT acquisition, primarily for human bodies within a light stage setup. We propose a semi-parametric approach for learning a neural representation of the LT that is embedded in a texture atlas of known but possibly rough geometry. We model all non-diffuse and global LT as residuals added to a physically based diffuse base rendering. In particular, we show how to fuse previously seen observations of illuminants and views to synthesize a new image of the same scene under a desired lighting condition from a chosen viewpoint. This strategy allows the network to learn complex material effects (such as subsurface scattering) and global illumination (such as diffuse interreflection), while guaranteeing the physical correctness of the diffuse LT (such as hard shadows). With this learned LT, one can relight the scene photorealistically with a directional light or an HDRI map, synthesize novel views with view-dependent effects, or do both simultaneously, all in a unified framework using a sparse set of observations. Qualitative and quantitative experiments demonstrate that our Neural Light Transport (NLT) outperforms state-of-the-art solutions for relighting and view synthesis, without the separate treatment of the two problems that prior work requires. The code and data are available at http://nlt.csail.mit.edu.
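The residual formulation described above can be illustrated with a minimal sketch. It is not the released NLT code: the function names, the flat per-texel lists, and the hard-coded residual values (standing in for a network prediction) are all assumptions made for illustration. The last function relies on the standard linearity of light transport, under which an environment-map relighting is a weighted sum of single-light renderings; the abstract does not spell out this step.

```python
from typing import List, Sequence

Vec3 = Sequence[float]


def diffuse_base(albedo: List[float], normals: List[Vec3],
                 visibility: List[float], light_dir: Vec3) -> List[float]:
    """Lambertian shading per texel: albedo * visibility * max(0, n . l).

    A visibility of 0 marks a texel in hard shadow, so the base is exactly
    black there regardless of what a network might predict.
    """
    shaded = []
    for a, n, v in zip(albedo, normals, visibility):
        cos_term = max(0.0, sum(ni * li for ni, li in zip(n, light_dir)))
        shaded.append(a * v * cos_term)
    return shaded


def compose(base: List[float], residual: List[float]) -> List[float]:
    """Final texel value = diffuse base + residual (non-diffuse and global LT)."""
    return [b + r for b, r in zip(base, residual)]


def relight_with_environment(single_light_images: List[List[float]],
                             light_weights: List[float]) -> List[float]:
    """Weighted sum of single-light renderings (linearity of light transport)."""
    out = [0.0] * len(single_light_images[0])
    for image, weight in zip(single_light_images, light_weights):
        for i, value in enumerate(image):
            out[i] += weight * value
    return out


# Two texels: one facing the light and unoccluded, one shadowed.
base = diffuse_base(albedo=[0.8, 0.5],
                    normals=[(0.0, 0.0, 1.0), (0.0, 1.0, 0.0)],
                    visibility=[1.0, 0.0],
                    light_dir=(0.0, 0.0, 1.0))
residual = [0.05, 0.02]  # hypothetical stand-in for a learned residual
final = compose(base, residual)

# Relight under two lights with hypothetical environment-map weights.
env = relight_with_environment([final, [0.1, 0.3]], [0.5, 2.0])
```

Keeping the diffuse term analytic means the learned part only has to account for what the simple shading model misses, which is the division of labor the abstract describes.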
