Adaptive Decision Fusion for Audio-Visual Speech Recognition