DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder
K. Yu | Qi Chen | Sheng Zhao | Tianyu He | Xuejiao Tan | Jiang Bian | Xie Chen | Chenpeng Du
[1] Eunho Yang,et al. Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Jiashi Feng,et al. MagicVideo: Efficient Video Generation With Latent Diffusion Models , 2022, ArXiv.
[3] Ming-Yu Liu,et al. SPACE: Speech-driven Portrait Animation with Controllable Expression , 2022, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[4] David J. Fleet,et al. Imagen Video: High Definition Video Generation with Diffusion Models , 2022, ArXiv.
[5] David C. Hogg,et al. Talking Head from Speech Audio using a Pre-trained Image Generator , 2022, ACM Multimedia.
[6] Jiwen Lu,et al. Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis , 2022, ECCV.
[7] Xie Chen,et al. VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature , 2022, INTERSPEECH.
[8] Shunyu Yao,et al. DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering , 2022, ArXiv.
[9] T. Komura,et al. FaceFormer: Speech-Driven 3D Facial Animation with Transformers , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Supasorn Suwajanakorn,et al. Diffusion Autoencoders: Toward a Meaningful and Decodable Representation , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] David J. Fleet,et al. Cascaded Diffusion Models for High Fidelity Image Generation , 2021, J. Mach. Learn. Res.
[12] Qifeng Chen,et al. Latent Video Diffusion Models for High-Fidelity Video Generation with Arbitrary Lengths , 2022, ArXiv.
[13] Vivek Kwatra,et al. LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Prafulla Dhariwal,et al. Diffusion Models Beat GANs on Image Synthesis , 2021, NeurIPS.
[15] Abhishek Kumar,et al. Score-Based Generative Modeling through Stochastic Differential Equations , 2020, ICLR.
[16] Jiaming Song,et al. Denoising Diffusion Implicit Models , 2020, ICLR.
[17] C. V. Jawahar,et al. A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild , 2020, ACM Multimedia.
[18] Abdel-rahman Mohamed,et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations , 2020, NeurIPS.
[19] Pieter Abbeel,et al. Denoising Diffusion Probabilistic Models , 2020, NeurIPS.
[20] Yang Zhou,et al. MakeItTalk , 2020, ACM Trans. Graph.
[21] Hujun Bao,et al. Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose , 2020, arXiv:2002.10137.
[22] Justus Thies,et al. Neural Voice Puppetry: Audio-driven Facial Reenactment , 2019, ECCV.
[23] Michael Auli,et al. vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations , 2019, ICLR.
[24] Maja Pantic,et al. Realistic Speech-Driven Facial Animation with GANs , 2019, International Journal of Computer Vision.
[25] Heiga Zen,et al. LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech , 2019, INTERSPEECH.
[26] Chenliang Xu,et al. Lip Movements Generation at a Glance , 2018, ECCV.
[27] Alexei A. Efros,et al. The Unreasonable Effectiveness of Deep Features as a Perceptual Metric , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[28] Jaakko Lehtinen,et al. Audio-driven facial animation by joint end-to-end learning of pose and emotion , 2017, ACM Trans. Graph.
[29] Yisong Yue,et al. A deep learning approach for generalized speech animation , 2017, ACM Trans. Graph.
[30] Surya Ganguli,et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics , 2015, ICML.
[31] Eero P. Simoncelli,et al. Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.