Effects of Lombard Reflex on the Performance of Deep-learning-based Audio-visual Speech Enhancement Systems
Jesper Jensen | Zheng-Hua Tan | Daniel Michelsanti | Sigurður Sigurðsson
[1] Nathalie Henrich, et al. Speaking in noise: How does the Lombard effect improve acoustic contrasts between speech and ambient noise?, 2014, Comput. Speech Lang.
[2] John H. L. Hansen, et al. Analysis and Compensation of Lombard Speech Across Noise Type and Levels With Application to In-Set/Out-of-Set Speaker Recognition, 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[3] DeLiang Wang, et al. On Training Targets for Supervised Speech Separation, 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[4] H. Lane, et al. The Lombard Sign and the Role of Hearing in Speech, 1971.
[5] Philipos C. Loizou, et al. Speech Enhancement: Theory and Practice, 2007.
[6] DeLiang Wang, et al. Supervised Speech Separation Based on Deep Learning: An Overview, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Davis E. King, et al. Dlib-ml: A Machine Learning Toolkit, 2009, J. Mach. Learn. Res.
[8] Kevin Wilson, et al. Looking to listen at the cocktail party, 2018, ACM Trans. Graph.
[9] Shmuel Peleg, et al. Visual Speech Enhancement, 2017, INTERSPEECH.
[10] Hiroshi Ishiguro, et al. Analysis of the visual Lombard effect and automatic recognition experiments, 2013, Comput. Speech Lang.
[11] J. C. Junqua, et al. The Lombard reflex and its role on human listeners and automatic speech recognizers, 1993, The Journal of the Acoustical Society of America.
[12] Jon Barker, et al. The impact of the Lombard effect on audio and visual speech recognition systems, 2018, Speech Commun.
[13] Maëva Garnier, et al. Hyper-articulation in Lombard speech: An active communicative strategy to enhance visible speech cues?, 2018, The Journal of the Acoustical Society of America.
[14] DeLiang Wang, et al. Complex Ratio Masking for Monaural Speech Separation, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[15] H. Brumm, et al. The evolution of the Lombard effect: 100 years of psychoacoustic research, 2011.
[16] Martin Cooke, et al. Speech production modifications produced in the presence of low-pass and high-pass filtered noise, 2009, The Journal of the Acoustical Society of America.
[17] Andries P. Hekstra, et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs, 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[18] Zheng-Hua Tan, et al. Speech enhancement using Long Short-Term Memory based recurrent Neural Networks for noise robust Speaker Verification, 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[19] E. Owens, et al. An Introduction to the Psychology of Hearing, 1997.
[20] Lawrence J. Raphael, et al. Speech Science Primer: Physiology, Acoustics, and Perception of Speech, 1980.
[21] W. H. Sumby, et al. Visual contribution to speech intelligibility in noise, 1954.
[22] J. L. Schwartz, et al. Audio-visual enhancement of speech in noise, 2001, The Journal of the Acoustical Society of America.
[23] Martin Cooke, et al. Speech production modifications produced by competing talkers, babble, and stationary noise, 2008, The Journal of the Acoustical Society of America.
[24] Yu Tsao, et al. Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks, 2017, IEEE Transactions on Emerging Topics in Computational Intelligence.
[25] Jesper Jensen, et al. An Algorithm for Predicting the Intelligibility of Speech Masked by Modulated Noise Maskers, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[26] T. Wiley, et al. Recognition of speech produced in noise, 2001, Journal of Speech, Language, and Hearing Research: JSLHR.
[27] Jon Barker, et al. An audio-visual corpus for speech perception and automatic speech recognition, 2006, The Journal of the Acoustical Society of America.
[28] Steve C. Maddock, et al. A corpus of audio-visual Lombard speech with frontal and profile views, 2018, The Journal of the Acoustical Society of America.
[29] Jesper Jensen, et al. On Training Targets and Objective Functions for Deep-learning-based Audio-visual Speech Enhancement, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[30] N. P. Erber. Auditory-visual perception of speech, 1975, The Journal of Speech and Hearing Disorders.
[31] Joon Son Chung, et al. The Conversation: Deep Audio-Visual Speech Enhancement, 2018, INTERSPEECH.