Paper Information - HQ3DAvatar: High Quality Controllable 3D Head Avatar

HQ3DAvatar: High Quality Controllable 3D Head Avatar

Multi-view volumetric rendering techniques have recently shown great potential in modeling and synthesizing high-quality head avatars. A common approach to capturing dynamic full-head performances is to track the underlying geometry using a mesh-based template or 3D cube-based graphics primitives. While these model-based approaches achieve promising results, they often fail to learn complex geometric details such as the mouth interior, hair, and topological changes over time. This paper presents a novel approach to building highly photorealistic digital head avatars. Our method learns a canonical space via an implicit function parameterized by a neural network. It leverages multiresolution hash encoding in the learned feature space, enabling high quality, faster training, and high-resolution rendering. At test time, our method is driven by a monocular RGB video. Here, an image encoder extracts face-specific features that also condition the learnable canonical space. This encourages deformation-dependent texture variations during training. We also propose a novel optical-flow-based loss that enforces correspondences in the learned canonical space, thus encouraging artifact-free and temporally consistent renderings. We show results on challenging facial expressions, along with free-viewpoint renderings at interactive real-time rates for medium image resolutions. Our method outperforms all existing approaches, both visually and numerically. We will release our multiple-identity dataset to encourage further research. The project page is available at: https://vcai.mpi-inf.mpg.de/projects/HQ3DAvatar/
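To make the multiresolution hash encoding mentioned in the abstract concrete, the following is a minimal, dependency-free sketch of the general technique as introduced in the instant neural graphics primitives work of Müller et al. It is not the authors' implementation: the abstract applies the encoding in a learned feature space, whereas this sketch encodes a plain 3D point in the unit cube. The function names (`hash_index`, `encode_point`, `make_tables`) and the default values for the coarsest resolution, growth factor, table size, and feature width are assumptions chosen for illustration.

```python
import math
import random

# Per-axis hashing constants used by the spatial hash in Instant-NGP.
PRIMES = (1, 2654435761, 805459861)


def hash_index(ix, iy, iz, table_size):
    """Map an integer grid corner to a slot in a fixed-size feature table."""
    h = (ix * PRIMES[0]) ^ (iy * PRIMES[1]) ^ (iz * PRIMES[2])
    return h % table_size


def encode_point(point, tables, n_min=16, growth=1.5):
    """Encode a 3D point in [0, 1]^3 as concatenated per-level features.

    tables: one table per resolution level, each a list of feature vectors.
    n_min and growth are illustrative defaults, not values from the paper.
    """
    features = []
    for level, table in enumerate(tables):
        res = math.floor(n_min * growth ** level)  # grid resolution at this level
        scaled = [c * res for c in point]
        base = [math.floor(s) for s in scaled]      # lower corner of the voxel
        frac = [s - b for s, b in zip(scaled, base)]
        n_feat = len(table[0])
        out = [0.0] * n_feat
        # Trilinear interpolation over the 8 corners of the enclosing voxel.
        for corner in range(8):
            offset = [(corner >> axis) & 1 for axis in range(3)]
            weight = 1.0
            for o, f in zip(offset, frac):
                weight *= f if o else 1.0 - f
            slot = hash_index(base[0] + offset[0],
                              base[1] + offset[1],
                              base[2] + offset[2],
                              len(table))
            for k in range(n_feat):
                out[k] += weight * table[slot][k]
        features.extend(out)
    return features


def make_tables(levels=4, table_size=2 ** 10, n_feat=2, seed=0):
    """Create small randomly initialized tables (trainable in a real system)."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1e-4, 1e-4) for _ in range(n_feat)]
             for _ in range(table_size)]
            for _ in range(levels)]
```

Calling `encode_point((0.3, 0.7, 0.5), make_tables())` returns a flat list of `levels * n_feat` values. In a trained model the table entries would be optimized jointly with the network that consumes the encoded features; here they are only randomly initialized.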
