Clustering Persian viseme using phoneme subspace for developing visual speech application
Mohammad Mahdi Dehshibi | Mohammad Aghaahmadi | Azam Bastanfard | Mahmood Fazlali
[1] Gerasimos Potamianos, et al. An image transform approach for HMM based automatic lipreading, 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).
[2] Thomas M. Levergood, et al. DECface: A system for synthetic face applications, 1995, Multimedia Tools and Applications.
[3] Kevin P. Murphy, et al. A coupled HMM for audio-visual speech recognition, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[4] Petr Císař, et al. Viseme analysis for speech-driven facial animation for Czech audio-visual speech synthesis, 2005.
[5] M. Pichora-Fuller, et al. Coarticulation effects in lipreading, 1982, Journal of speech and hearing research.
[6] Javier Melenchón, et al. Objective viseme extraction and audiovisual uncertainty: estimation limits between auditory and visual modes, 2007, AVSP.
[7] Mikko Sams, et al. Parameterized visual speech synthesis and its evaluation, 2000, 2000 10th European Signal Processing Conference.
[8] Juergen Luettin, et al. Audio-Visual Automatic Speech Recognition: An Overview, 2004.
[9] Anton Nijholt, et al. Classifying Visemes for Automatic Lipreading, 1999, TSD.
[10] Christopher G. Harris, et al. A Combined Corner and Edge Detector, 1988, Alvey Vision Conference.
[11] Ilse Lehiste, et al. Coarticulation Effects in the Identification of Final Plosives, 1972.
[12] Caroline Henton, et al. Generating and manipulating emotional synthetic speech on a personal computer, 1996, Multimedia Tools and Applications.
[13] J. A. Hartigan, et al. A k-means clustering algorithm, 1979.
[14] John H. L. Hansen, et al. DSP for In-Vehicle and Mobile Systems, 2014.
[15] Bernhard Schölkopf, et al. Nonlinear Component Analysis as a Kernel Eigenvalue Problem, 1998, Neural Computation.
[16] Nasrollah Moghaddam Charkari, et al. Multimodal information fusion application to human emotion recognition from face and speech, 2010, Multimedia Tools and Applications.
[17] M. Turk, et al. Eigenfaces for Recognition, 1991, Journal of Cognitive Neuroscience.
[18] C. G. Fisher, et al. Confusions among visually perceived consonants, 1968, Journal of speech and hearing research.
[19] Olle Bälter, et al. Wizard-of-Oz test of ARTUR: a computer-based speech training system with articulation correction, 2005, Assets '05.
[20] Tony Ezzat, et al. Visual Speech Synthesis by Morphing Visemes, 2000, International Journal of Computer Vision.
[21] Mohammad Aghaahmadi, et al. Persian Viseme Classification for Developing Visual Speech Training Application, 2009, PCM.
[22] Walid Mahdi, et al. Lip Localization and Viseme Classification for Visual Speech Recognition, 2013, ArXiv.
[23] Aggelos K. Katsaggelos, et al. Frame Rate and Viseme Analysis for Multimedia Applications to Assist Speechreading, 1998, J. VLSI Signal Process.
[24] Horst Bunke, et al. Sentence Lipreading Using Hidden Markov Model with Integrated Grammar, 2001, Int. J. Pattern Recognit. Artif. Intell.
[25] Bernard Tiddeman, et al. Prototyping and transforming visemes for animated speech, 2002, Proceedings of Computer Animation 2002 (CA 2002).
[26] Hitoshi Kiya, et al. Proceedings of the Advances in multimedia information processing, and 11th Pacific Rim conference on Multimedia: Part II, 2010.
[27] Phillip A. Laplante, et al. A multimedia speech learning system for the hearing impaired, 1996, Multimedia Tools and Applications.
[28] Teuvo Kohonen, et al. The self-organizing map, 1990.
[29] Sherif Abdou, et al. Audio-visual phoneme classification for pronunciation training applications, 2007, INTERSPEECH.
[30] Wladyslaw Skarbek, et al. Viseme Classification for Talking Head Application, 2005, CAIP.
[31] Mohammad Aghaahmadi, et al. The Persian Linguistic Based Audio-Visual Data Corpus, AVA II, Considering Coarticulation, 2010, MMM.
[32] Hakan Erdogan, et al. Audio-visual speech recognition in vehicular noise using a multi-classifier approach, 2007.
[33] Christophe Garcia, et al. A Wavelet-based Framework for Face Recognition, 1998.
[34] Hedvig Kjellström, et al. Audiovisual-to-articulatory inversion, 2009, Speech Commun.
[35] Mikhail Belkin, et al. Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering, 2001, NIPS.
[36] Mohammad Aghaahmadi, et al. A comprehensive audio-visual corpus for teaching sound Persian phoneme articulation, 2009, 2009 IEEE International Conference on Systems, Man and Cybernetics.
[37] J. H. Ward. Hierarchical Grouping to Optimize an Objective Function, 1963.
[38] Anders Löfqvist. Vowel-to-vowel coarticulation in Japanese: the effect of consonant duration, 2009, The Journal of the Acoustical Society of America.
[39] R. Safabakhsh, et al. AUT-Talk: A Farsi Talking Head, 2006, 2006 2nd International Conference on Information & Communication Technologies.
[40] Azam Bastanfard, et al. A Novel Multimedia Educational Speech Therapy System for Hearing Impaired Children, 2010, PCM.