CPNet: Exploiting CLIP-based Attention Condenser and Probability Map Guidance for High-fidelity Talking Face Generation
[1] Minglun Gong,et al. Dynamic Mixture of Counter Network for Location-Agnostic Crowd Counting , 2023, 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).
[2] Jun Ling,et al. StableFace: Analyzing and Improving Motion Stability for Talking Face Generation , 2022, ArXiv.
[3] Hyoung-Kyu Song,et al. Talking Face Generation with Multilingual TTS , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] T. Popa,et al. CLIP-Mesh: Generating textured meshes from text using pretrained image-text models , 2022, SIGGRAPH Asia.
[5] Lei Xie,et al. AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Persons , 2021, IEEE Transactions on Multimedia.
[6] Jun Zhou,et al. STNet: Scale Tree Network With Multi-Level Auxiliator for Crowd Counting , 2020, IEEE Transactions on Multimedia.
[7] Antoni B. Chan,et al. Kernel-Based Density Map Generation for Dense Object Counting , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[8] Zejun Ma,et al. Towards Realistic Visual Dubbing with Heterogeneous Sources , 2021, ACM Multimedia.
[9] Engin Erzin,et al. Investigating Contributions of Speech and Facial Landmarks for Talking Head Generation , 2021, Interspeech.
[10] Daniel Cohen-Or,et al. StyleGAN-NADA , 2021, ACM Trans. Graph.
[11] Yu Ding,et al. Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Vivek Kwatra,et al. LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Xun Cao,et al. Audio-Driven Emotional Video Portraits , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Daniel Cohen-Or,et al. StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[15] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[16] Ran He,et al. Talking Face Generation via Learning Semantic and Temporal Synchronous Landmarks , 2021, 2020 25th International Conference on Pattern Recognition (ICPR).
[17] Lingyun Yu,et al. Multimodal Inputs Driven Talking Face Generation With Spatial–Temporal Dependency , 2021, IEEE Transactions on Circuits and Systems for Video Technology.
[18] Yan Wang,et al. Speech Driven Talking Head Generation via Attentional Landmarks Based Representation , 2020, INTERSPEECH.
[19] C. V. Jawahar,et al. A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild , 2020, ACM Multimedia.
[20] Enhua Wu,et al. Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[21] V. Lempitsky,et al. Few-Shot Adversarial Learning of Realistic Neural Talking Head Models , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[22] Timo Aila,et al. A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Sjoerd van Steenkiste,et al. Towards Accurate Generative Models of Video: A New Metric & Challenges , 2018, ArXiv.
[24] Yoshua Bengio,et al. ObamaNet: Photo-realistic lip-sync from text , 2017, ArXiv.
[25] Raymond Y. K. Lau,et al. Least Squares Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[26] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Li Fei-Fei,et al. Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.
[28] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[29] Djemel Ziou,et al. Image Quality Metrics: PSNR vs. SSIM , 2010, 2010 20th International Conference on Pattern Recognition.