Learning to Relight Portrait Images via a Virtual Light Stage and Synthetic-to-Real Adaptation

Given a portrait image of a person and an environment map of the target lighting, portrait relighting aims to re-illuminate the person as if they appeared in an environment with the target lighting. To achieve high-quality results, recent methods rely on deep learning. An effective approach is to supervise the training of deep neural networks with a high-fidelity dataset of desired input-output pairs captured with a light stage. However, acquiring such data requires an expensive, specialized capture rig and time-consuming effort, limiting access to a few well-resourced laboratories. To address this limitation, we propose a new approach that performs on par with state-of-the-art (SOTA) relighting methods without requiring a light stage. Our approach is based on the observation that successfully relighting a portrait image depends on two conditions. First, the method needs to mimic the behavior of physically-based relighting. Second, the output has to be photorealistic. To meet the first condition, we propose to train the relighting network with data generated by a virtual light stage that performs physically-based rendering on various 3D synthetic humans under different environment maps. To meet the second condition, we develop a novel synthetic-to-real approach to bring photorealism to the relighting network output. In addition to achieving SOTA results, our approach offers several advantages over prior methods, including controllable glare on glasses and more temporally consistent results when relighting videos.
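For readers unfamiliar with light-stage data, the following is a minimal sketch of classical image-based relighting, not the method proposed in this paper. It assumes a set of basis images, each showing the subject lit by one light direction, and a target environment that has already been sampled into one RGB intensity per light. Because light transport is linear, the relit image is the weighted sum of the basis images. The function name `relight_from_basis` and the nested-list image layout are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[float, float, float]
Image = List[List[RGB]]  # rows of pixels, each pixel an (r, g, b) tuple


def relight_from_basis(basis_images: Sequence[Image],
                       light_colors: Sequence[RGB]) -> Image:
    """Relight by linear superposition of one-light-at-a-time basis images.

    basis_images[i] is the subject lit only by light i (hypothetical input).
    light_colors[i] is the target environment's RGB intensity in the
    direction of light i (assumed to be pre-sampled from the environment map).
    """
    if not basis_images or len(basis_images) != len(light_colors):
        raise ValueError("need exactly one light color per basis image")

    height = len(basis_images[0])
    width = len(basis_images[0][0])
    relit = [[[0.0, 0.0, 0.0] for _ in range(width)] for _ in range(height)]

    # Accumulate each basis image scaled per channel by its light's color.
    for image, color in zip(basis_images, light_colors):
        for y in range(height):
            for x in range(width):
                for c in range(3):
                    relit[y][x][c] += image[y][x][c] * color[c]

    return [[tuple(pixel) for pixel in row] for row in relit]


# Usage: two 1x1 basis images, a pure red light and a dim white light.
basis = [[[(1.0, 1.0, 1.0)]], [[(1.0, 1.0, 1.0)]]]
colors = [(1.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
result = relight_from_basis(basis, colors)
```

A physically-based renderer in a virtual light stage can produce such input-output pairs synthetically; the sketch only illustrates the kind of behavior a relighting network is expected to mimic.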
