Emotional Speech-Driven Animation with Content-Emotion Disentanglement