Audio-driven facial animation by joint end-to-end learning of pose and emotion
Jaakko Lehtinen | Timo Aila | Samuli Laine | Tero Karras | Antti Herva