Signal properties reducing intelligibility of speech after noise reduction

The effect of noise reduction on the intelligibility of speech in noise is poorly understood. Although noise reduction improves the SNR of noisy speech by removing more noise than speech from the signal, the expected increase in intelligibility typically does not occur. To account for the deleterious effects of noise reduction, we present an orthogonal decomposition of the signal intensity envelopes at the output of a filterbank. The noisy speech envelopes are decomposed into components indicating (1) the coherence of speech across audio bands; (2) the distortion of the speech envelope; and (3) the speechiness of the noise. By modelling the results of a listening experiment, we show that envelope distortion can largely account for the deleterious effects of noise reduction, although reduced coherence could also play a role at low SNRs. There was little evidence that increased speechiness of the noise contributed to the poorer intelligibility after noise reduction.
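The abstract does not specify how the decomposition is constructed, so the following is only a minimal sketch of the general idea of an orthogonal envelope decomposition, not the method used in the paper. It assumes a single filterbank band, toy envelopes, and a simple least-squares projection: the noisy envelope is split into a part aligned with the clean speech envelope and a residual orthogonal to it. The function name `decompose_envelope`, the 100 Hz envelope sampling rate, and the 4 Hz modulation are all hypothetical choices made for illustration.

```python
import numpy as np


def decompose_envelope(noisy_env, speech_env):
    """Split one band's noisy envelope into a speech-aligned part and an
    orthogonal residual (illustrative projection, not the paper's method)."""
    # Remove the means so the projection acts on the envelope fluctuations.
    x = noisy_env - noisy_env.mean()
    s = speech_env - speech_env.mean()
    # Least-squares gain of the speech envelope within the noisy envelope.
    gain = np.dot(x, s) / np.dot(s, s)
    aligned = gain * s          # component along the speech envelope
    residual = x - aligned      # component orthogonal to the speech envelope
    return aligned, residual, gain


# Toy data (assumed): one second of envelope sampled at 100 Hz.
rng = np.random.default_rng(0)
t = np.arange(0, 1.0, 1 / 100)
speech_env = 1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)   # 4 Hz speech-like modulation
noise_env = 1.0 + 0.1 * rng.standard_normal(t.size)  # weakly fluctuating noise
noisy_env = speech_env + noise_env                   # intensities assumed additive

aligned, residual, gain = decompose_envelope(noisy_env, speech_env)

# Because the two parts are orthogonal, their energies add up exactly.
total_energy = np.dot(aligned + residual, aligned + residual)
split_energy = np.dot(aligned, aligned) + np.dot(residual, residual)
```

The point of orthogonality in a sketch like this is that the fluctuation energy of the noisy envelope divides cleanly between the components, so each one can be varied or related to intelligibility scores without double counting.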
