Learning Continuous Facial Actions From Speech for Real-Time Animation
[1] Yan Tong,et al. Listen to Your Face: Inferring Facial Action Units from Audio Channel , 2017, IEEE Transactions on Affective Computing.
[2] Hai Xuan Pham,et al. End-to-end Learning for 3D Facial Animation from Speech , 2018, ICMI.
[3] S. R. Livingstone,et al. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English , 2018, PloS one.
[4] Jaakko Lehtinen,et al. Audio-driven facial animation by joint end-to-end learning of pose and emotion , 2017, ACM Trans. Graph..
[5] Yisong Yue,et al. A deep learning approach for generalized speech animation , 2017, ACM Trans. Graph..
[6] Hai Xuan Pham,et al. Speech-Driven 3D Facial Animation with Implicit Emotional Awareness: A Deep Learning Approach , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[7] Joon Son Chung,et al. You said that? , 2017, BMVC.
[8] Vladimir Pavlovic,et al. Robust Real-Time 3D Face Tracking from RGBD Videos under Extreme Pose, Depth, and Expression Variations , 2017 .
[9] Eduardo Coutinho,et al. The INTERSPEECH 2016 Computational Paralinguistics Challenge: Deception, Sincerity & Native Language , 2016, INTERSPEECH.
[10] Björn W. Schuller,et al. The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing , 2016, IEEE Transactions on Affective Computing.
[11] George Trigeorgis,et al. Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Hai Xuan Pham,et al. Robust real-time performance-driven 3D face tracking , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).
[13] Min Chen,et al. Audiovisual Facial Action Unit Recognition using Feature Level Fusion , 2016, Int. J. Multim. Data Eng. Manag..
[14] Frank K. Soong,et al. A deep bidirectional LSTM approach for video-realistic talking head , 2016, Multimedia Tools and Applications.
[15] Tara N. Sainath,et al. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Ron J. Weiss,et al. Speech acoustic modeling from raw multichannel waveforms , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Tara N. Sainath,et al. Deep Convolutional Neural Networks for Large-scale Speech Tasks , 2015, Neural Networks.
[18] Fabien Ringeval,et al. Face reading from speech - predicting facial action units from audio cues , 2015, INTERSPEECH.
[19] Tara N. Sainath,et al. Learning the speech front-end with raw waveform CLDNNs , 2015, INTERSPEECH.
[20] Lei Xie,et al. Head motion synthesis from speech using deep neural networks , 2015, Multimedia Tools and Applications.
[21] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[22] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[23] Gerald Penn,et al. Convolutional Neural Networks for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[24] Frank K. Soong,et al. On the training aspects of Deep Neural Network (DNN) for parametric TTS synthesis , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Yiying Tong,et al. FaceWarehouse: A 3D Facial Expression Database for Visual Computing , 2014, IEEE Transactions on Visualization and Computer Graphics.
[26] Björn W. Schuller,et al. Recent developments in openSMILE, the munich open-source multimedia feature extractor , 2013, ACM Multimedia.
[27] Frank K. Soong,et al. A new language independent, photo-realistic talking head driven by voice only , 2013, INTERSPEECH.
[28] Fabio Valente,et al. The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism , 2013, INTERSPEECH.
[29] Frédéric Bimbot,et al. Facial Expression Recognition from Speech , 2013 .
[30] Li Deng,et al. A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[31] Heiga Zen,et al. Statistical parametric speech synthesis using deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[32] Dimitri Palaz,et al. Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks , 2013, INTERSPEECH.
[33] K. Scherer,et al. Introducing the Geneva Multimodal expression corpus for experimental research on emotion perception. , 2012, Emotion.
[34] Gerald Penn,et al. Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Frank K. Soong,et al. Text Driven 3D Photo-Realistic Talking Head , 2011, INTERSPEECH.
[36] Brian C. Lovell,et al. Multi-Region Probabilistic Histograms for Robust and Scalable Identity Inference , 2009, ICB.
[37] Björn Granström,et al. SynFace—Speech-Driven Facial Animation for Virtual Speech-Reading Support , 2009, EURASIP J. Audio Speech Music. Process..
[38] James D. Edge,et al. Audio-visual feature selection and reduction for emotion classification , 2008, AVSP.
[39] Alice Wang,et al. Assembling an expressive facial animation system , 2007, Sandbox '07.
[40] Lei Xie,et al. Realistic Mouth-Synching for Speech-Driven Talking Face Using Articulatory Modelling , 2007, IEEE Transactions on Multimedia.
[41] Lianhong Cai,et al. Real-time synthesis of Chinese visual speech and facial expressions using MPEG-4 FAP features in a three-dimensional avatar , 2006, INTERSPEECH.
[42] Frédéric H. Pighin,et al. Expressive speech-driven facial animation , 2005, ACM Trans. Graph..
[43] Jörn Ostermann,et al. Lifelike talking faces for interactive services , 2003, Proc. IEEE.
[44] Tomaso A. Poggio,et al. Reanimating Faces in Images and Video , 2003, Comput. Graph. Forum.
[45] Keiichi Tokuda,et al. HMM-based text-to-audio-visual speech synthesis , 2000, INTERSPEECH.
[46] Matthew Turk,et al. A Morphable Model For The Synthesis Of 3D Faces , 1999, SIGGRAPH.
[47] Yoshua Bengio,et al. Convolutional networks for images, speech, and time series , 1998 .
[48] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[49] Christoph Bregler,et al. Video Rewrite: Driving Visual Speech with Audio , 1997, SIGGRAPH.
[50] P. Ekman,et al. Universals and Cultural Differences in the Judgments of Facial Expressions of Emotion , 2004 .
[51] P. Ekman,et al. Constants across cultures in the face and emotion. , 1971, Journal of personality and social psychology.