Noise robust speech recognition using spectral subtraction and F0 information extracted by Hough transform

We propose a noise-robust speech recognition method that combines spectral subtraction with novel features extracted from fundamental frequency (F0) information. F0 features have been shown to be effective for speech recognition in noisy environments. Recently, F0 features obtained by the Hough transform were developed for concatenated digit recognition and significantly improved recognition performance on noisy speech. This paper proposes novel Hough-transform-based features for large-vocabulary continuous speech recognition. In addition, spectral subtraction is applied before the Hough transform to remove static noise. The proposed method was tested on the Japanese Newspaper Article Sentences (JNAS) database. Word accuracy improved in all noise conditions; the best absolute improvement was 2.6 percentage points, obtained when station noise was added at 10 dB SNR.
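The spectral subtraction step mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic magnitude-domain variant, and the function name `spectral_subtraction`, the frame length, hop size, spectral floor, and the choice to estimate noise from the first few frames are all assumptions made for illustration.

```python
import numpy as np

def spectral_subtraction(noisy, noise_frames=5, frame_len=256, hop=128, floor=0.01):
    """Generic magnitude spectral subtraction (illustrative sketch only).

    Assumes the first `noise_frames` frames contain noise only, which is a
    hypothetical simplification and not something stated in the abstract.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(noisy) - frame_len) // hop

    # Cut the signal into overlapping, windowed frames.
    frames = np.stack([noisy[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    # Average magnitude spectrum of the assumed noise-only frames.
    noise_mag = mag[:noise_frames].mean(axis=0)

    # Subtract the noise estimate, keeping a small spectral floor so that
    # no bin becomes negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)

    # Resynthesize with the noisy phase and overlap-add the frames.
    clean = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    out = np.zeros(len(noisy))
    for i in range(n_frames):
        out[i * hop:i * hop + frame_len] += clean[i]
    return out
```

In a pipeline like the one described, the enhanced signal (or its spectrum) would then be passed to F0 feature extraction; how the Hough transform is applied after this step is not specified in the abstract and is not sketched here.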
