Talking Head from Speech Audio using a Pre-trained Image Generator
[1] Mohamed Elhoseiny, et al. StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Daniel Cohen-Or, et al. Pivotal Tuning for Latent-based Editing of Real Images, 2021, ACM Trans. Graph.
[3] Chen Change Loy, et al. Everybody’s Talkin’: Let Me Talk as You Want, 2020, IEEE Transactions on Information Forensics and Security.
[4] Madhukar Budagavi, et al. FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[5] Changjie Fan, et al. Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion, 2021, IJCAI.
[6] Christian Theobalt, et al. StyleVideoGAN: A Temporal Generative Model using a Pretrained StyleGAN, 2021, BMVC.
[7] Jaakko Lehtinen, et al. Alias-Free Generative Adversarial Networks, 2021, NeurIPS.
[8] Yu Ding, et al. Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Vivek Kwatra, et al. LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Dimitris N. Metaxas, et al. A Good Image Generator Is What You Need for High-Resolution Video Synthesis, 2021, ICLR.
[11] Moustafa Meshry, et al. Learned Spatial Representations for Few-shot Talking-Head Synthesis, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[12] Chen Change Loy, et al. Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Xun Cao, et al. Audio-Driven Emotional Video Portraits, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Daniel Cohen-Or, et al. Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Dipanjan Das, et al. Speech-Driven Facial Animation Using Cascaded GANs for Learning of Motion and Texture, 2020, ECCV.
[16] C. V. Jawahar, et al. A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild, 2020, ACM Multimedia.
[17] Yang Zhou, et al. MakeItTalk: Speaker-Aware Talking-Head Animation, 2020, ACM Trans. Graph.
[18] Aaron Hertzmann, et al. GANSpace: Discovering Interpretable GAN Controls, 2020, NeurIPS.
[19] Justus Thies, et al. Neural Voice Puppetry: Audio-driven Facial Reenactment, 2019, ECCV.
[20] Tero Karras, et al. Analyzing and Improving the Image Quality of StyleGAN, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Bolei Zhou, et al. Interpreting the Latent Space of GANs for Semantic Face Editing, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Yu Qiao, et al. MEAD: A Large-Scale Audio-Visual Dataset for Emotional Talking-Face Generation, 2020, ECCV.
[23] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[24] Maja Pantic, et al. Realistic Speech-Driven Facial Animation with GANs, 2019, International Journal of Computer Vision.
[25] Chenliang Xu, et al. Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Peter Wonka, et al. Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[27] Andreas Rössler, et al. FaceForensics++: Learning to Detect Manipulated Facial Images, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[28] Timo Aila, et al. A Style-Based Generator Architecture for Generative Adversarial Networks, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Hang Zhou, et al. Talking Face Generation by Adversarially Disentangled Audio-Visual Representation, 2018, AAAI.
[30] M. Nießner, et al. ForensicTransfer: Weakly-supervised Domain Adaptation for Forgery Detection, 2018, ArXiv.
[31] Maja Pantic, et al. End-to-End Speech-Driven Facial Animation with Temporal GANs, 2018, BMVC.
[32] Chenliang Xu, et al. Lip Movements Generation at a Glance, 2018, ECCV.
[33] Alexei A. Efros, et al. The Unreasonable Effectiveness of Deep Features as a Perceptual Metric, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[34] Jaakko Lehtinen, et al. Progressive Growing of GANs for Improved Quality, Stability, and Variation, 2017, ICLR.
[35] Joon Son Chung, et al. You said that?, 2017, BMVC.
[36] Naomi Harte, et al. TCD-TIMIT: An Audio-Visual Corpus of Continuous Speech, 2015, IEEE Transactions on Multimedia.
[37] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[38] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[39] Eero P. Simoncelli, et al. Image quality assessment: from error visibility to structural similarity, 2004, IEEE Transactions on Image Processing.