Visual Speech Animation
Lei Xie | Lijuan Wang | Shan Yang