StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-Based Generator
Dongliang He, Errui Ding, Jingdong Wang, Jingtuo Liu, Hang Zhou, Ziwei Liu, Haocheng Feng, Kaisiyuan Wang, Jiazhi Guan, Tianshu Hu, Zhanwang Zhang