A comparative intelligibility study of single-microphone noise reduction algorithms.

The evaluation of intelligibility of noise reduction algorithms is reported. IEEE sentences and consonants were corrupted by four types of noise including babble, car, street and train at two signal-to-noise ratio levels (0 and 5 dB), and then processed by eight speech enhancement methods encompassing four classes of algorithms: spectral subtractive, sub-space, statistical model based and Wiener-type algorithms. The enhanced speech was presented to normal-hearing listeners for identification. With the exception of a single noise condition, no algorithm produced significant improvements in speech intelligibility. Information transmission analysis of the consonant confusion matrices indicated that no algorithm improved significantly the place feature score, significantly, which is critically important for speech recognition. The algorithms which were found in previous studies to perform the best in terms of overall quality, were not the same algorithms that performed the best in terms of speech intelligibility. The subspace algorithm, for instance, was previously found to perform the worst in terms of overall quality, but performed well in the present study in terms of preserving speech intelligibility. Overall, the analysis of consonant confusion matrices suggests that in order for noise reduction algorithms to improve speech intelligibility, they need to improve the place and manner feature scores.

[1]  John Mourjopoulos,et al.  Speech enhancement based on audible noise suppression , 1997, IEEE Trans. Speech Audio Process..

[2]  I. Cohen,et al.  Noise estimation by minima controlled recursive averaging for robust speech enhancement , 2002, IEEE Signal Processing Letters.

[3]  Pascal Scalart,et al.  Speech enhancement based on a priori signal to noise estimation , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[4]  IEEE Recommended Practice for Speech Quality Measurements , 1969, IEEE Transactions on Audio and Electroacoustics.

[5]  John H. L. Hansen,et al.  Evaluation of an auditory masked threshold noise suppression algorithm in normal-hearing and hearing-impaired listeners , 2003, Speech Commun..

[6]  Yi Hu,et al.  A Comparative Intelligibility Study of Speech Enhancement Algorithms , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[7]  Yi Hu,et al.  Speech enhancement based on wavelet thresholding the multitaper spectrum , 2004, IEEE Transactions on Speech and Audio Processing.

[8]  Wonyong Sung,et al.  A statistical model-based voice activity detection , 1999, IEEE Signal Processing Letters.

[9]  David Malah,et al.  Speech enhancement using a minimum mean-square error log-spectral amplitude estimator , 1984, IEEE Trans. Acoust. Speech Signal Process..

[10]  Erik G. Larsson,et al.  Cramer-Rao bound analysis of distributed positioning in sensor networks , 2004, IEEE Signal Processing Letters.

[11]  Benoît Champagne,et al.  Incorporating the human hearing properties in the signal subspace approach for speech enhancement , 2003, IEEE Trans. Speech Audio Process..

[12]  Philipos C. Loizou,et al.  A multi-band spectral subtraction method for enhancing speech corrupted by colored noise , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[13]  G. A. Miller,et al.  An Analysis of Perceptual Confusions Among Some English Consonants , 1955 .

[14]  Philipos C. Loizou,et al.  Speech Enhancement: Theory and Practice , 2007 .

[15]  Yi Hu,et al.  A generalized subspace approach for enhancing speech corrupted by colored noise , 2003, IEEE Trans. Speech Audio Process..

[16]  S. Boll,et al.  Suppression of acoustic noise in speech using spectral subtraction , 1979 .

[17]  David Pearce,et al.  The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions , 2000, INTERSPEECH.

[18]  Sven Nordholm,et al.  Spectral subtraction using reduced delay convolution and adaptive averaging , 2001, IEEE Trans. Speech Audio Process..

[19]  Nam C. Phamdo,et al.  Signal/noise KLT based approach for enhancing speech degraded by colored noise , 2000, IEEE Trans. Speech Audio Process..

[20]  Jae Lim,et al.  Evaluation of a correlation subtraction method for enhancing speech degraded by additive white noise , 1978 .

[21]  Yi Hu,et al.  Subspace algorithms for noise reduction in cochlear implants. , 2005, The Journal of the Acoustical Society of America.

[22]  Yi Hu,et al.  Subjective comparison and evaluation of speech enhancement algorithms , 2007, Speech Commun..