A statistical parametric approach to video-realistic text-driven talking avatar
Lei Xie | Bo Fan | Naicai Sun
[1] Christoph Bregler,et al. Video Rewrite: Driving Visual Speech with Audio , 1997, SIGGRAPH.
[2] Patrick Pérez,et al. Poisson image editing , 2003, ACM Trans. Graph..
[3] Mark Pauly,et al. Realtime performance-based facial animation , 2011, ACM Trans. Graph..
[4] Zhigang Deng,et al. Data-Driven 3D Facial Animation , 2007 .
[5] Barry-John Theobald,et al. Animating Faces Using Appearance Models , 2007 .
[6] Barry-John Theobald,et al. A real-time speech-driven talking head using active appearance models , 2007, AVSP.
[7] Juergen Luettin,et al. Audio-Visual Automatic Speech Recognition: An Overview , 2004 .
[8] Satoshi Nakamura,et al. Lip movement synthesis from speech based on hidden Markov models , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.
[9] Yongxin Wang,et al. Emotional Audio-Visual Speech Synthesis Based on PAD , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[10] Jörn Ostermann,et al. Talking faces - technologies and applications , 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004..
[11] Takao Kobayashi,et al. Text-to-audio-visual speech synthesis based on parameter generation from HMM , 1999, EUROSPEECH.
[12] Keiichi Tokuda,et al. Text-to-visual speech synthesis based on parameter generation from HMM , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[13] Lei Xie,et al. Realistic Mouth-Synching for Speech-Driven Talking Face Using Articulatory Modelling , 2007, IEEE Transactions on Multimedia.
[14] Tony Ezzat,et al. Visual Speech Synthesis by Morphing Visemes , 2000, International Journal of Computer Vision.
[15] Zhihong Zeng,et al. Audio–Visual Affective Expression Recognition Through Multistream Fused HMM , 2008, IEEE Transactions on Multimedia.
[16] H. McGurk,et al. Hearing lips and seeing voices , 1976, Nature.
[17] Jörn Ostermann,et al. Optimization of an Image-Based Talking Head System , 2009, EURASIP J. Audio Speech Music. Process..
[18] Jenq-Neng Hwang,et al. Baum-Welch hidden Markov model inversion for reliable audio-to-visual conversion , 1999, 1999 IEEE Third Workshop on Multimedia Signal Processing (Cat. No.99TH8451).
[19] Keiichi Tokuda,et al. A training method for average voice model based on shared decision tree context clustering and speaker adaptive training , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..
[20] Gérard Bailly,et al. Animating Virtual Speakers or Singers from Audio: Lip-Synching Facial Animation , 2009, EURASIP J. Audio Speech Music. Process..
[21] Algirdas Pakstas,et al. MPEG-4 Facial Animation: The Standard, Implementation and Applications , 2002 .
[22] Lianhong Cai,et al. Head and facial gestures synthesis using PAD model for an expressive talking avatar , 2014, Multimedia Tools and Applications.
[23] Björn Granström,et al. SynFace—Speech-Driven Facial Animation for Virtual Speech-Reading Support , 2009, EURASIP J. Audio Speech Music. Process..
[24] Tony Ezzat,et al. Trainable videorealistic speech animation , 2002, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..
[25] Lianhong Cai,et al. Synthesizing English emphatic speech for multimodal corrective feedback in computer-aided pronunciation training , 2013, Multimedia Tools and Applications.
[26] Lianhong Cai,et al. Real-time synthesis of Chinese visual speech and facial expressions using MPEG-4 FAP features in a three-dimensional avatar , 2006, INTERSPEECH.
[27] H. Shimodaira,et al. Carnival—Combining Speech Technology and Computer Animation , 2011, IEEE Computer Graphics and Applications.
[28] Keiichi Tokuda,et al. Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[29] Matthew Brand,et al. Voice puppetry , 1999, SIGGRAPH.
[30] Ricardo Gutierrez-Osuna,et al. Audio/visual mapping with cross-modal hidden Markov models , 2005, IEEE Transactions on Multimedia.
[31] Tsuhan Chen,et al. Audiovisual speech processing , 2001, IEEE Signal Process. Mag..
[32] Frank K. Soong,et al. Text Driven 3D Photo-Realistic Talking Head , 2011, INTERSPEECH.
[33] Frank K. Soong,et al. Synthesizing photo-real talking head via trajectory-guided sample selection , 2010, INTERSPEECH.
[34] Abdelmajid Ben Hamadou,et al. Off-line handwritten word recognition using multi-stream hidden Markov models , 2010, Pattern Recognit. Lett..
[35] Junichi Yamagishi,et al. Speech-driven lip motion generation with a trajectory HMM , 2008, INTERSPEECH.
[36] Timothy F. Cootes,et al. Active Appearance Models , 1998, ECCV.
[37] Jörn Ostermann,et al. Lifelike talking faces for interactive services , 2003, Proc. IEEE.
[38] Q. Summerfield. Some preliminaries to a comprehensive account of audio-visual speech perception. , 1987 .
[39] David Salesin,et al. Synthesizing realistic facial expressions from photographs , 1998, SIGGRAPH.
[40] Gérard Bailly,et al. LIPS2008: visual speech synthesis challenge , 2008, INTERSPEECH.
[41] Matthew Turk,et al. A Morphable Model For The Synthesis Of 3D Faces , 1999, SIGGRAPH.
[42] Giampiero Salvi,et al. Using HMMs and ANNs for mapping acoustic to visual speech , 1999 .
[43] Tomaso A. Poggio,et al. Reanimating Faces in Images and Video , 2003, Comput. Graph. Forum.
[44] Keiichi Tokuda,et al. HMM-based text-to-audio-visual speech synthesis , 2000, INTERSPEECH.