Generating Talking Face Landmarks from Speech
Chenliang Xu | Zhiyao Duan | Ross K. Maddox | Sefik Emre Eskimez