DeepLight: Learning Illumination for Unconstrained Mobile Mixed Reality

We present a learning-based method to infer plausible high dynamic range (HDR), omnidirectional illumination given an unconstrained, low dynamic range (LDR) image from a mobile phone camera with a limited field of view (FOV). For training data, we collect videos of various reflective spheres placed within the camera's FOV, leaving most of the background unoccluded, and leverage the fact that materials with diverse reflectance functions reveal different lighting cues in a single exposure. We train a deep neural network to regress from the LDR background image to HDR lighting by matching the LDR ground truth sphere images to those rendered with the predicted illumination using image-based relighting, which is differentiable. Our inference runs at interactive frame rates on a mobile device, enabling realistic rendering of virtual objects into real scenes for mobile mixed reality. Training on automatically exposed and white-balanced videos, we improve the realism of rendered objects compared to state-of-the-art methods for both indoor and outdoor scenes.
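
The training objective described above can be illustrated with a minimal sketch. It assumes that image-based relighting is a weighted sum of precomputed basis images (one sphere image per light direction), that the HDR render is turned into an LDR image by clipping and gamma encoding, and that the comparison uses a mean absolute error. The function names, array shapes, gamma value, and choice of L1 error are all hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def relight(basis, lighting):
    """Image-based relighting as a linear combination of basis images.

    basis:    (N, H, W, 3) sphere appearance under each of N unit-intensity lights
              (hypothetical layout, assumed for this sketch).
    lighting: (N, 3) HDR RGB intensity of each light direction.
    Returns an (H, W, 3) HDR image.
    """
    return np.einsum("nhwc,nc->hwc", basis, lighting)

def to_ldr(hdr, gamma=2.2):
    """Assumed camera model: clip to [0, 1], then gamma encode."""
    return np.clip(hdr, 0.0, 1.0) ** (1.0 / gamma)

def relighting_loss(basis, predicted_lighting, target_ldr):
    """Mean absolute error between the LDR render and the LDR ground truth sphere image."""
    rendered = to_ldr(relight(basis, predicted_lighting))
    return float(np.mean(np.abs(rendered - target_ldr)))

# Usage: the loss is zero when the predicted lighting reproduces the target image.
rng = np.random.default_rng(0)
basis = 0.1 * rng.random((4, 2, 2, 3))   # small values so nothing saturates
true_lighting = rng.random((4, 3))
target = to_ldr(relight(basis, true_lighting))
print(relighting_loss(basis, true_lighting, target))
print(relighting_loss(basis, 0.5 * true_lighting, target))
```

Because the render is a linear function of the lighting, gradients of the loss with respect to the predicted lighting are straightforward to compute in an automatic differentiation framework; NumPy is used here only to keep the sketch short. Under this assumed clipping model, saturated pixels would contribute no gradient.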
