The impact of the Lombard effect on audio and visual speech recognition systems
[1] David B Pisoni, et al. Some normative data on lip-reading skills (L), 2011, The Journal of the Acoustical Society of America.
[2] Hiroshi Ishiguro, et al. Analysis of the visual Lombard effect and automatic recognition experiments, 2013, Comput. Speech Lang.
[3] David Pearce, et al. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, 2000, INTERSPEECH.
[4] Davis E. King, et al. Dlib-ml: A Machine Learning Toolkit, 2009, J. Mach. Learn. Res.
[5] Jeesun Kim, et al. Perceptual processing of audiovisual Lombard speech, 2006.
[6] W. H. Sumby, et al. Visual contribution to speech intelligibility in noise, 1954.
[7] Ning Ma, et al. Improving audio-visual speech recognition using deep neural networks with dynamic stream reliability estimates, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Fu Jie Huang, et al. Consideration of Lombard effect for speechreading, 2001, 2001 IEEE Fourth Workshop on Multimedia Signal Processing (Cat. No.01TH8564).
[9] John H. L. Hansen, et al. Analysis and compensation of stressed and noisy speech with application to robust automatic recognition, 1988.
[10] Virginia Best, et al. How Visual Cues for when to Listen Aid Selective Auditory Attention, 2012, Journal of the Association for Research in Otolaryngology.
[11] H. Brumm, et al. The evolution of the Lombard effect: 100 years of psychoacoustic research, 2011.
[12] Jeesun Kim, et al. Auditory and auditory-visual Lombard speech perception by younger and older adults, 2013, AVSP.
[13] John Makhoul, et al. Speaker adaptive training: a maximum likelihood approach to speaker normalization, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[14] Kazuya Takeda, et al. Variability of Lombard effects under different noise conditions, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[15] Jon Barker, et al. The second ‘chime’ speech separation and recognition challenge: Datasets, tasks and baselines, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[16] Guy J. Brown, et al. Computational auditory scene analysis, 1994, Comput. Speech Lang.
[17] D W Massaro, et al. Perception of asynchronous and conflicting visual and auditory speech, 1996, The Journal of the Acoustical Society of America.
[18] Brian R Glasberg, et al. Derivation of auditory filter shapes from notched-noise data, 1990, Hearing Research.
[19] Hani Yehia, et al. Audiovisual Lombard speech: reconciling production and perception, 2007, AVSP.
[20] Jeesun Kim, et al. Hearing Speech in Noise: Seeing a Loud Talker is Better, 2011, Perception.
[21] E. Owens, et al. An Introduction to the Psychology of Hearing, 1997.
[22] Jon Barker, et al. Modelling speaker intelligibility in noise, 2007, Speech Commun.
[23] J S Perkell, et al. Effects of short-term auditory deprivation on speech production in adult cochlear implant users, 1992, The Journal of the Acoustical Society of America.
[24] B. J. Stanton, et al. Robust recognition of loud and Lombard speech in the fighter cockpit environment, 1989, International Conference on Acoustics, Speech, and Signal Processing.
[25] Naveen Parihar, et al. Performance analysis of the Aurora large vocabulary baseline system, 2004, 2004 12th European Signal Processing Conference.
[26] Martin Cooke, et al. A glimpsing model of speech perception in noise, 2006, The Journal of the Acoustical Society of America.
[27] T. Wiley, et al. Recognition of speech produced in noise, 2001, Journal of speech, language, and hearing research: JSLHR.
[28] Nathalie Henrich, et al. Speaking in noise: How does the Lombard effect improve acoustic contrasts between speech and ambient noise?, 2014, Comput. Speech Lang.
[29] John H. L. Hansen, et al. Analysis and Compensation of Lombard Speech Across Noise Type and Levels With Application to In-Set/Out-of-Set Speaker Recognition, 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[30] Martin Cooke, et al. The contribution of changes in F0 and spectral tilt to increased intelligibility of speech produced in noise, 2009, Speech Commun.
[31] Lisa Tang, et al. Examining visible articulatory features in clear and plain speech, 2015, Speech Commun.
[32] John H. L. Hansen, et al. Source generator equalization and enhancement of spectral properties for robust speech recognition in noise and stress, 1995, IEEE Trans. Speech Audio Process.
[33] Ning Ma, et al. The PASCAL CHiME speech separation and recognition challenge, 2013, Comput. Speech Lang.
[34] Martin Cooke, et al. Speech production modifications produced by competing talkers, babble, and stationary noise, 2008, The Journal of the Acoustical Society of America.
[35] B. J. Stanton, et al. Acoustic-phonetic analysis of loud and Lombard speech in simulated cockpit conditions, 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.
[36] John H. L. Hansen, et al. A comparative study of traditional and newly proposed features for recognition of speech under stress, 2000, IEEE Trans. Speech Audio Process.
[37] J C Junqua, et al. The Lombard reflex and its role on human listeners and automatic speech recognizers, 1993, The Journal of the Acoustical Society of America.
[38] Jeesun Kim, et al. The effect of seeing the interlocutor on auditory and visual speech production in noise, 2015, Speech Commun.
[39] Josephine Sullivan, et al. One millisecond face alignment with an ensemble of regression trees, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[40] M. Picheny, et al. Speaking clearly for the hard of hearing. II: Acoustic characteristics of clear and conversational speech, 1986, Journal of speech and hearing research.
[41] V C Tartter, et al. Some acoustic effects of listening to noise on speech production, 1993, The Journal of the Acoustical Society of America.
[42] John H. L. Hansen, et al. Robust speech recognition training via duration and spectral-based stress token generation, 1995, IEEE Trans. Speech Audio Process.
[43] S. J. Young, et al. Tree-based state tying for high accuracy acoustic modelling, 1994.
[44] R. Patel, et al. The influence of linguistic content on the Lombard effect, 2008, Journal of speech, language, and hearing research: JSLHR.
[45] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[46] Eric Vatikiotis-Bateson, et al. Auditory, but perhaps not visual, processing of Lombard speech, 2006.
[47] Shimon Whiteson, et al. LipNet: End-to-End Sentence-level Lipreading, 2016, arXiv:1611.01599.
[48] Martin Cooke, et al. The contribution of durational and spectral changes to the Lombard speech intelligibility benefit, 2014, The Journal of the Acoustical Society of America.
[49] R. H. Bernacki, et al. Effects of noise on speech production: acoustic and perceptual analyses, 1988, The Journal of the Acoustical Society of America.
[50] Jon Barker, et al. An audio-visual corpus for speech perception and automatic speech recognition, 2006, The Journal of the Acoustical Society of America.
[51] Guy J. Brown, et al. Robust audiovisual speech recognition using noise-adaptive linear discriminant analysis, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).