Continuous visual speech recognition for multimodal fusion