GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation
Jia-Bin Huang | Zhenhui Ye | Rongjie Huang | Zhou Zhao | Jinglin Liu | Zejun Ma | Ziyue Jiang | Xiang Yin | Jinzheng He | Yixiang Ren
[1] Zhenhui Ye,et al. GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis , 2023, ICLR.
[2] Anni Tang,et al. Memories are One-to-Many Mapping Alleviators in Talking Face Generation , 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Gang Zeng,et al. Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition , 2022, ArXiv.
[4] Jiwen Lu,et al. Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis , 2022, ECCV.
[5] Xiaoguang Han,et al. Expressive Talking Head Generation with Granular Audio-Visual Control , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Wayne Wu,et al. EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model , 2022, SIGGRAPH.
[7] Zhenhui Ye,et al. SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech , 2022, IJCAI.
[8] Yujiu Yang,et al. StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN , 2022, ECCV.
[9] Bolei Zhou,et al. Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation , 2022, ECCV.
[10] T. Müller,et al. Instant neural graphics primitives with a multiresolution hash encoding , 2022, ACM Trans. Graph.
[11] Shalini De Mello,et al. Efficient Geometry-aware 3D Generative Adversarial Networks , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Ligang Liu,et al. HeadNeRF: A Realtime NeRF-based Parametric Head Model , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Xun Cao,et al. MoFaNeRF: Morphable Facial Neural Radiance Field , 2021, ECCV.
[14] Xu Tan,et al. Transformer-S2A: Robust and Efficient Speech-to-Animation , 2021, ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Haozhe Wu,et al. Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis , 2021, ACM Multimedia.
[16] Jinxiang Chai,et al. Live speech portraits , 2021, ACM Trans. Graph.
[17] Madhukar Budagavi,et al. FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[18] Hideki Koike,et al. Speech2Talking-Face: Inferring and Driving a Face with Synchronized Audio-Visual Representation , 2021, IJCAI.
[19] Jonathan T. Barron,et al. HyperNeRF , 2021, ACM Trans. Graph.
[20] Ruslan Salakhutdinov,et al. HuBERT: How Much Can a Bad Teacher Benefit ASR Pre-Training? , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] H. Bao,et al. AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[22] Lingyun Yu,et al. Multimodal Inputs Driven Talking Face Generation With Spatial–Temporal Dependency , 2021, IEEE Transactions on Circuits and Systems for Video Technology.
[23] Justus Thies,et al. Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Jiajun Wu,et al. pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Francesc Moreno-Noguer,et al. D-NeRF: Neural Radiance Fields for Dynamic Scenes , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Jonathan T. Barron,et al. Nerfies: Deformable Neural Radiance Fields , 2020, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[27] C. V. Jawahar,et al. A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild , 2020, ACM Multimedia.
[28] Lin Gao,et al. DeepFaceDrawing: deep generation of face images from sketches , 2020, ACM Trans. Graph.
[29] Haitian Zheng,et al. What comprises a good talking-head video generation?: A Survey and Benchmark , 2020, ArXiv.
[30] Yang Zhou,et al. MakeItTalk , 2020, ACM Trans. Graph.
[31] Pratul P. Srinivasan,et al. NeRF , 2020, ECCV.
[32] Hujun Bao,et al. Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose , 2020, arXiv:2002.10137.
[33] Justus Thies,et al. Neural Voice Puppetry: Audio-driven Facial Reenactment , 2019, ECCV.
[34] Lingyun Wu,et al. MaskGAN: Towards Diverse and Interactive Facial Image Manipulation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Joon Son Chung,et al. You Said That?: Synthesising Talking Faces from Audio , 2019, International Journal of Computer Vision.
[36] Chenliang Xu,et al. Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Hang Zhou,et al. Talking Face Generation by Adversarially Disentangled Audio-Visual Representation , 2018, AAAI.
[38] Joon Son Chung,et al. LRS3-TED: a large-scale dataset for visual speech recognition , 2018, ArXiv.
[39] Chenliang Xu,et al. Lip Movements Generation at a Glance , 2018, ECCV.
[40] Jaakko Lehtinen,et al. Audio-driven facial animation by joint end-to-end learning of pose and emotion , 2017, ACM Trans. Graph.
[41] Sepp Hochreiter,et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.
[42] Aaron C. Courville,et al. Improved Training of Wasserstein GANs , 2017, NIPS.
[43] Alexei A. Efros,et al. Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.
[45] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[46] W. Thompson,et al. Facial expressions of singers influence perceived pitch relations , 2010, Psychonomic Bulletin & Review.
[47] Sami Romdhani,et al. A 3D Face Model for Pose and Illumination Invariant Face Recognition , 2009, 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance.
[48] S. T. Roweis,et al. Nonlinear dimensionality reduction by locally linear embedding , 2000, Science.