Evaluating Near End Listening Enhancement Algorithms in Realistic Environments

Speech playback (e.g., TV, radio, public address) becomes harder to understand in the presence of noise and reverberation. Near-end listening enhancement (NELE) algorithms can improve intelligibility by modifying the signal before it is played back. Substantial intelligibility improvements have been achieved in the lab for both natural and synthetic speech. However, evidence is still scarce on how these algorithms perform under realistic noise and reverberation. We present a realistic test platform featuring two representative everyday scenarios in which speech playback may occur in the presence of both noise and reverberation: a domestic space (living room) and a public space (cafeteria). The generated stimuli are evaluated by measuring keyword accuracy rates in a listening test with normal-hearing subjects. We use the new platform to compare three state-of-the-art NELE algorithms that employ either noise-adaptive or non-adaptive strategies, with or without compensation for reverberation.
