Improving speech intelligibility in noise by SII-dependent preprocessing using frequency-dependent amplification and dynamic range compression

This contribution proposes a new preprocessing algorithm that improves speech intelligibility in noise while keeping the signal power the same before and after processing. The proposed AdaptDRC algorithm consists of two time- and frequency-dependent stages, both of which are functions of the estimated Speech Intelligibility Index (SII). The first stage applies a time- and frequency-dependent amplification, and the second stage applies a time- and frequency-dependent dynamic range compression (DRC). Experiments with a competing speaker (CS) and with speech-shaped noise (SSN) show an increase in predicted speech intelligibility over a wide range of SNRs for four different objective measures that are correlated with speech intelligibility. Listening tests conducted within the framework of the Hurricane Challenge with 175 subjects confirm these findings and show improvements in intelligibility of up to 20.5% for SSN and 12.3% for CS.
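
The two-stage structure with a power constraint can be illustrated with a minimal sketch. It is not the AdaptDRC implementation: the mappings from the estimated SII to the band gains and to the compression ratio, the function names, and the parameters `max_ratio` and `threshold` are all assumptions chosen for illustration, and the time dependence is omitted by processing a single block of already band-split samples.

```python
import math

EPS = 1e-12  # guards against zero-power bands


def band_power(x):
    """Mean power of one band signal."""
    return sum(v * v for v in x) / len(x)


def total_power(bands):
    """Sum of the mean powers of all bands."""
    return sum(band_power(b) for b in bands)


def rescale(bands, target_power):
    """Scale all bands by one common factor so the total power equals target_power."""
    scale = math.sqrt(target_power / max(total_power(bands), EPS))
    return [[v * scale for v in b] for b in bands]


def compress(x, ratio, threshold):
    """Static compressor: magnitudes above threshold grow with exponent 1/ratio."""
    out = []
    for v in x:
        mag = abs(v)
        if mag > threshold:
            mag = threshold * (mag / threshold) ** (1.0 / ratio)
        out.append(math.copysign(mag, v))
    return out


def process_block(bands, sii_est, max_ratio=4.0, threshold=0.1):
    """Two-stage, power-preserving processing of one block (illustrative only).

    bands   : list of per-band sample lists (band splitting is assumed done)
    sii_est : estimated SII in [0, 1]; 1 means no processing is needed
    """
    p_in = total_power(bands)

    # Stage 1 (assumed mapping): the lower the SII, the more the band
    # powers are pushed towards equal power across bands.
    exponent = -(1.0 - sii_est) / 2.0
    stage1 = [[v * max(band_power(b), EPS) ** exponent for v in b] for b in bands]
    stage1 = rescale(stage1, p_in)

    # Stage 2 (assumed mapping): compression ratio grows linearly as SII drops.
    ratio = 1.0 + (max_ratio - 1.0) * (1.0 - sii_est)
    stage2 = [compress(b, ratio, threshold) for b in stage1]

    # Restore the input power so processing does not change the overall level.
    return rescale(stage2, p_in)
```

With `sii_est = 1.0` the sketch leaves the block essentially unchanged, and for lower values it shifts power towards weaker bands and compresses more strongly, while the final rescaling keeps the total power at its input value.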
