Deep Portrait Lighting Enhancement with 3D Guidance

Despite recent breakthroughs in deep learning methods for image lighting enhancement, they are inferior when applied to portraits because 3D facial information is ignored in their models. To address this, we present a novel deep learning framework for portrait lighting enhancement based on 3D facial guidance. Our framework consists of two stages. In the first stage, corrected lighting parameters are predicted by a network from the input bad lighting image, with the assistance of a 3D morphable model and a differentiable renderer. Given the predicted lighting parameter, the differentiable renderer renders a face image with corrected shading and texture, which serves as the 3D guidance for learning image lighting enhancement in the second stage. To better exploit the long‐range correlations between the input and the guidance, in the second stage, we design an image‐to‐image translation network with a novel transformer architecture, which automatically produces a lighting‐enhanced result. Experimental results on the FFHQ dataset and in‐the‐wild images show that the proposed method outperforms state‐of‐the‐art methods in terms of both quantitative metrics and visual quality.

[1]  Michael J. Black,et al.  OpenDR: An Approximate Differentiable Renderer , 2014, ECCV.

[2]  Xiaowei Zhou,et al.  Learning to Estimate 3D Human Pose and Shape from a Single Color Image , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3]  Carlos D. Castillo,et al.  SfSNet: Learning Shape, Reflectance and Illuminance of Faces 'in the Wild' , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4]  Michel Sarkis,et al.  Towards High Fidelity Face Relighting with Realistic Shadows , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Jiaolong Yang,et al.  Accurate 3D Face Reconstruction With Weakly-Supervised Learning: From Single Image to Image Set , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[6]  Sami Romdhani,et al.  A 3D Face Model for Pose and Illumination Invariant Face Recognition , 2009, 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance.

[7]  Arun Mallya,et al.  World-Consistent Video-to-Video Synthesis , 2020, ECCV.

[8]  Qinjie Xiao,et al.  GAN-Based Facial Attractiveness Enhancement , 2020, ArXiv.

[9]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Lei Zhang,et al.  Learning a Deep Single Image Contrast Enhancer from Multi-Exposure Images , 2018, IEEE Transactions on Image Processing.

[11]  Jan Kautz,et al.  High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[12]  Jian Sun,et al.  Automatic Exposure Correction of Consumer Photographs , 2012, ECCV.

[13]  Jitendra Malik,et al.  Multi-view Supervision for Single-View Reconstruction via Differentiable Ray Consistency , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Philippe Robert,et al.  Robust Point Light Source Estimation Using Differentiable Rendering , 2018, ArXiv.

[15]  Georg Heigold,et al.  An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2021, ICLR.

[16]  Ding Liu,et al.  EnlightenGAN: Deep Light Enhancement Without Paired Supervision , 2019, IEEE Transactions on Image Processing.

[17]  Jean-François Lalonde,et al.  Learning Physics-Guided Face Relighting Under Directional Light , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Konstantin Sofiiuk,et al.  Foreground-aware Semantic Representations for Image Harmonization , 2020, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).

[19]  Tatsuya Harada,et al.  Neural 3D Mesh Renderer , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[21]  Chen Wei,et al.  Deep Retinex Decomposition for Low-Light Enhancement , 2018, BMVC.

[22]  Takeo Kanade,et al.  Multi-PIE , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[23]  Hans-Peter Seidel,et al.  A Versatile Scene Model with Differentiable Visibility Applied to Generative Pose Estimation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[24]  Adam Finkelstein,et al.  Perspective-aware manipulation of portrait photos , 2016, ACM Trans. Graph..

[25]  Hao Li,et al.  Soft Rasterizer: A Differentiable Renderer for Image-Based 3D Reasoning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[26]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[27]  David E. Jacobs,et al.  Portrait shadow manipulation , 2020, ACM Trans. Graph..

[28]  Miika Aittala,et al.  A Dataset of Multi-Illumination Images in the Wild , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[29]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[30]  Yung-Yu Chuang,et al.  Deep Photo Enhancer: Unpaired Learning for Image Enhancement from Photographs with GANs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[31]  Paul Debevec,et al.  Deep reflectance fields , 2019, ACM Trans. Graph..

[32]  Taesung Park,et al.  Semantic Image Synthesis With Spatially-Adaptive Normalization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Xiaoou Tang,et al.  Aesthetic-Driven Image Enhancement by Adversarial Learning , 2017, ACM Multimedia.

[34]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Jaakko Lehtinen,et al.  Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer , 2019, NeurIPS.

[36]  Alain Trémeau,et al.  Residual Conv-Deconv Grid Network for Semantic Segmentation , 2017, BMVC.

[37]  Hao He,et al.  Exposure , 2017, ACM Trans. Graph..

[38]  Honglak Lee,et al.  Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision , 2016, NIPS.

[39]  Sam Kwong,et al.  Zero-Reference Deep Curve Estimation for Low-Light Image Enhancement , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  David W. Jacobs,et al.  Deep Single-Image Portrait Relighting , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[41]  Stephen Lin,et al.  Exposure , 2018, ACM Transactions on Graphics.

[42]  Nicolas Usunier,et al.  End-to-End Object Detection with Transformers , 2020, ECCV.

[43]  Yun-Ta Tsai,et al.  Single image portrait relighting , 2019, ACM Trans. Graph..

[44]  Jing Liao,et al.  High-Fidelity Pluralistic Image Completion with Transformers , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[45]  Ersin Yumer,et al.  Neural Face Editing with Intrinsic Image Disentangling , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Chi-Wing Fu,et al.  Underexposed Photo Enhancement Using Deep Illumination Estimation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Deli Zhao,et al.  DeepExposure: Learning to Expose Photos with Asynchronously Reinforced Adversarial Learning , 2018, NeurIPS.