Improving performance of spectral subtraction in speech recognition using a model for additive noise

This paper addresses the problem of speech recognition when the signal is corrupted by additive noise at moderate signal-to-noise ratio (SNR). A model for additive noise is presented and used to compute the uncertainty about the hidden clean signal, which in turn weights the estimate provided by spectral subtraction. Weighted versions of the dynamic time warping (DTW) and Viterbi (HMM) algorithms are tested. The results show that weighting the information along the signal can substantially improve the performance of spectral subtraction, an easily implemented technique, even with a poor noise estimate and without any information about the speaker. The weighting procedure is also shown to reduce the error rate when cepstral mean normalization is used in addition to cancel convolutional noise.
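The idea of weighting the spectral-subtraction estimate inside DTW can be illustrated with a minimal Python sketch. It is not the method of the paper: the abstract gives neither the noise model nor the weighting function, so the reliability weight used here (estimated SNR divided by one plus estimated SNR) is a placeholder assumption, and the function names `spectral_subtraction`, `frame_weight`, and `weighted_dtw` are hypothetical.

```python
def spectral_subtraction(noisy_power, noise_power, floor=0.01):
    """Subtract the noise power estimate in each band.

    Each band is floored at `floor` times the noisy power so that the
    result stays positive (the floor value is an assumption).
    """
    return [max(y - n, floor * y) for y, n in zip(noisy_power, noise_power)]


def frame_weight(noisy_power, noise_power):
    """Hypothetical reliability weight in [0, 1) for one frame.

    Uses snr / (1 + snr) with a crude frame-level SNR estimate; this
    stands in for the uncertainty derived from the paper's noise model.
    """
    noise = sum(noise_power)
    signal = max(sum(noisy_power) - noise, 0.0)
    snr = signal / noise
    return snr / (1.0 + snr)


def weighted_dtw(test, weights, ref):
    """DTW in which each test frame's local distance is scaled by its weight.

    Returns the accumulated weighted cost divided by the accumulated
    weight along the chosen path, or infinity if that weight is zero.
    """
    inf = float("inf")
    n, m = len(test), len(ref)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    wsum = [[0.0] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Squared Euclidean distance between feature vectors.
            d = sum((a - b) ** 2 for a, b in zip(test[i - 1], ref[j - 1]))
            # Cheapest predecessor among the three standard DTW moves.
            best, pi, pj = min(
                (cost[i - 1][j], i - 1, j),
                (cost[i][j - 1], i, j - 1),
                (cost[i - 1][j - 1], i - 1, j - 1),
            )
            cost[i][j] = best + weights[i - 1] * d
            wsum[i][j] = wsum[pi][pj] + weights[i - 1]
    return cost[n][m] / wsum[n][m] if wsum[n][m] > 0 else inf
```

In this sketch, a frame judged unreliable (weight near zero) contributes almost nothing to the alignment cost, so one badly corrupted frame does not dominate the match against a reference template. The same weights could in principle scale the frame log-likelihoods in a Viterbi search, but that variant is not shown here.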
