Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields

Capitalizing on recent advances in image generation models, existing controllable face image synthesis methods can generate high-fidelity images with some degree of controllability, e.g., over the shapes, expressions, textures, and poses of the generated face images. However, these methods focus on 2D image generative models, which are prone to producing inconsistent face images under large expression and pose changes. In this paper, we propose a new NeRF-based conditional 3D face synthesis framework, which enables 3D controllability over the generated face images by imposing explicit 3D conditions from 3D face priors. At its core is a conditional Generative Occupancy Field (cGOF) that effectively enforces the shape of the generated face to conform to a given 3D Morphable Model (3DMM) mesh. To achieve accurate control over the fine-grained 3D face shapes of the synthesized images, we additionally incorporate a 3D landmark loss as well as a volume warping loss into our synthesis algorithm. Experiments validate the effectiveness of the proposed method, which generates high-fidelity face images and shows more precise 3D controllability than state-of-the-art 2D-based controllable face synthesis methods. Code and demo are available at https://keqiangsun.github.io/projects/cgof.
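
The abstract names a 3D landmark loss but does not give its formula. As a rough illustration of the general idea only, the sketch below computes a mean squared Euclidean distance between corresponding 3D landmarks of a generated face and of a target 3DMM mesh. The function name `landmark_loss`, the point representation, and the choice of a squared-distance penalty are all assumptions made for this example and may differ from the actual formulation.

```python
from typing import Sequence, Tuple

# Hypothetical representation: one landmark is an (x, y, z) tuple.
Point3D = Tuple[float, float, float]


def landmark_loss(generated: Sequence[Point3D], target: Sequence[Point3D]) -> float:
    """Mean squared Euclidean distance between corresponding 3D landmarks.

    This is an assumed, generic form of a landmark loss, not the
    formulation used by the cGOF authors.
    """
    if len(generated) != len(target):
        raise ValueError("landmark sets must have the same length")
    if not generated:
        raise ValueError("at least one landmark is required")
    total = 0.0
    for (gx, gy, gz), (tx, ty, tz) in zip(generated, target):
        # Squared distance between one generated/target landmark pair.
        total += (gx - tx) ** 2 + (gy - ty) ** 2 + (gz - tz) ** 2
    return total / len(generated)


# Toy data: two landmarks, each displaced along a single axis.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
generated = [(0.0, 0.0, 1.0), (1.0, 2.0, 0.0)]
loss = landmark_loss(generated, target)
```

With the toy data, the squared distances are 1 and 4, so the loss evaluates to 2.5; identical landmark sets give 0. In a training setting, a term of this kind would typically be weighted and added to the other generator losses.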
