Learning Speech-driven 3D Conversational Gestures from Video
Hans-Peter Seidel | Christian Theobalt | Lingjie Liu | Ikhsanul Habibie | Gerard Pons-Moll | Mohamed Elgharib | Dushyant Mehta | Weipeng Xu
[1] Justine Cassell,et al. BEAT: the Behavior Expression Animation Toolkit , 2001, Life-like characters.
[2] Hai Xuan Pham,et al. End-to-end Learning for 3D Facial Animation from Speech , 2018, ICMI.
[3] S. Goldin-Meadow,et al. The role of gesture in communication and thinking , 1999, Trends in Cognitive Sciences.
[4] Yisong Yue,et al. A deep learning approach for generalized speech animation , 2017, ACM Trans. Graph.
[5] Ira Kemelmacher-Shlizerman,et al. Synthesizing Obama , 2017, ACM Trans. Graph.
[6] Kenta Takeuchi,et al. Creating a Gesture-Speech Dataset for Speech-Based Automatic Gesture Generation , 2017, HCI.
[7] James J. Little,et al. A Simple Yet Effective Baseline for 3d Human Pose Estimation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[8] Simon Lucey,et al. Deformable Model Fitting by Regularized Landmark Mean-Shift , 2010, International Journal of Computer Vision.
[9] Jonas Beskow,et al. Style‐Controllable Speech‐Driven Gesture Synthesis Using Normalising Flows , 2020, Comput. Graph. Forum.
[10] Carlos Busso,et al. Speech-driven Animation with Meaningful Behaviors , 2017, Speech Commun.
[11] Rachel McDonnell,et al. Investigating the use of recurrent motion modelling for speech gesture generation , 2018, IVA.
[12] Hans-Peter Seidel,et al. Annotated New Text Engine Animation Animation Lexicon Animation Gesture Profiles MR : . . . JL : . . . Gesture Generation Video Annotated Gesture Script , 2007 .
[13] Ira Kemelmacher-Shlizerman,et al. Audio to Body Dynamics , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[14] Jan-Michael Frahm,et al. Towards Fully Mobile 3D Face, Body, and Environment Capture Using Only Head-worn Cameras , 2018, IEEE Transactions on Visualization and Computer Graphics.
[15] Christian Theobalt,et al. Reconstruction of Personalized 3D Face Rigs from Monocular Video , 2016, ACM Trans. Graph.
[16] Youngwoo Yoon,et al. Robots Learn Social Skills: End-to-End Learning of Co-Speech Gesture Generation for Humanoid Robots , 2018, 2019 International Conference on Robotics and Automation (ICRA).
[17] Taku Komura,et al. A Deep Learning Framework for Character Motion Synthesis and Editing , 2016, ACM Trans. Graph.
[18] Elisabeth André,et al. The Persona Effect: How Substantial Is It? , 1998, BCS HCI.
[19] Pascal Fua,et al. XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera , 2020, SIGGRAPH 2020.
[20] Qiang Huo,et al. Video-audio driven real-time facial animation , 2015, ACM Trans. Graph.
[21] Mark Steedman,et al. Animated conversation: rule-based generation of facial expression, gesture & spoken intonation for multiple conversational agents , 1994, SIGGRAPH.
[22] Stefanos Zafeiriou,et al. Synthesising 3D Facial Motion from “In-the-Wild” Speech , 2019, 2020 15th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020).
[23] Yukiko I. Nakano,et al. Style Transfer for Co-Speech Gesture Animation: A Multi-Speaker Conditional-Mixture Approach , 2020, ECCV.
[24] Justine Cassell,et al. Embodied conversational interface agents , 2000, CACM.
[25] Sergey Levine,et al. Real-time prosody-driven synthesis of body language , 2009, ACM Trans. Graph.
[26] Sherman Wilcox,et al. Language and Gesture , 2017 .
[27] Christian Theobalt,et al. Monocular Real-Time Hand Shape and Motion Capture Using Multi-Modal Data , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Michael Neff,et al. Multi-objective adversarial gesture generation , 2019, MIG.
[29] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.
[30] Naoshi Kaneko,et al. Analyzing Input and Output Representations for Speech-Driven Gesture Generation , 2019, IVA.
[31] Engin Erzin,et al. Multimodal analysis of speech prosody and upper body gestures using hidden semi-Markov models , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[32] Yuyu Xu,et al. Virtual character performance from speech , 2013, SCA '13.
[33] Hiroshi Shimodaira,et al. Bidirectional LSTM Networks Employing Stacked Bottleneck Features for Expressive Speech-Driven Head Motion Synthesis , 2016, IVA.
[34] J. Cassell,et al. Embodied conversational agents , 2000 .
[35] Youngwoo Yoon,et al. Speech gesture generation from the trimodal context of text, audio, and speaker identity , 2020, ACM Trans. Graph.
[36] Taku Komura,et al. Learning motion manifolds with convolutional autoencoders , 2015, SIGGRAPH Asia Technical Briefs.
[37] Wojciech Zaremba,et al. Improved Techniques for Training GANs , 2016, NIPS.
[38] Yang Liu,et al. Speech-Driven Animation Constrained by Appropriate Discourse Functions , 2014, ICMI.
[39] Manfred K. Warmuth,et al. The CMU Sphinx-4 Speech Recognition System , 2001 .
[40] Stacy Marsella,et al. Predicting Co-verbal Gestures: A Deep and Temporal Modeling Approach , 2015, IVA.
[41] Yaser Sheikh,et al. OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[42] Stacy Marsella,et al. How to Train Your Avatar: A Data Driven Approach to Gesture Generation , 2011, IVA.
[43] Kazuhiko Sumi,et al. Speech-to-Gesture Generation: A Challenge in Deep Learning Approach with Bi-Directional LSTM , 2017, HAI.
[44] Gustav Eje Henter,et al. Gesticulator: A framework for semantically-aware speech-driven gesture generation , 2020, ICMI.
[45] Dario Pavllo,et al. 3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Jitendra Malik,et al. Learning Individual Styles of Conversational Gesture , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Kazuhiko Sumi,et al. Evaluation of Speech-to-Gesture Generation Using Bi-Directional LSTM Network , 2018, IVA.
[48] Michael J. Black,et al. Learning a model of facial shape and expression from 4D scans , 2017, ACM Trans. Graph.
[49] Carlos Busso,et al. Generating Human-Like Behaviors Using Joint, Speech-Driven Models for Conversational Agents , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[50] S. Levine,et al. Gesture controllers , 2010, ACM Trans. Graph.
[51] A. Kendon. Gesture: Visible Action as Utterance , 2004 .
[52] Michael J. Black,et al. Capture, Learning, and Synthesis of 3D Speaking Styles , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[53] Jaakko Lehtinen,et al. Audio-driven facial animation by joint end-to-end learning of pose and emotion , 2017, ACM Trans. Graph.
[54] Stacy Marsella,et al. Gesture generation with low-dimensional embeddings , 2014, AAMAS.
[55] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[56] Yaser Sheikh,et al. Talking With Hands 16.2M: A Large-Scale Dataset of Synchronized Body-Finger Motion and Audio for Conversational Motion Analysis and Synthesis , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[57] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[58] Yang Liu,et al. Meaningful head movements driven by emotional synthetic speech , 2017, Speech Commun.