Paper Information - Robust Speech Recognition with MSC / DRA Feature Extraction on Modulation Spectrum Domain

This report introduces noise-robust speech recognition and proposes two speech analysis techniques, MSC (Modulation Spectrum Control) and DRA (Dynamic Range Adjustment). The dynamic range of the cepstrum obtained from noisy speech is usually smaller than that of the same speech without noise, because some speech features are hidden by the noise. This difference can cause recognition errors, so adjusting the dynamic range enables more accurate extraction of speech features. Both proposed techniques adjust speech features: DRA normalizes dynamic ranges, and MSC eliminates the noise corruption of speech feature parameters. Experiments on isolated word recognition were carried out using 40 male and female speakers for training and 5 male and female speakers for recognition. As an example, the recognition rate under running car noise at -10 dB SNR improves from 17% to 64%.
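The abstract does not state the DRA formula, so the following is only a minimal sketch of the idea of normalizing dynamic ranges. It assumes that each cepstral coefficient's time trajectory is divided by its maximum absolute value over the utterance; the function name `dynamic_range_adjust` and the toy data are hypothetical and not taken from the paper. MSC is not sketched here because the abstract gives no detail on how the modulation spectrum is controlled.

```python
def dynamic_range_adjust(cepstra):
    """Scale each cepstral coefficient trajectory so its peak magnitude is 1.

    cepstra: list of frames, each frame a list of cepstral coefficients.
    Assumption (not from the abstract): per-coefficient max-abs
    normalization over the whole utterance.
    """
    if not cepstra:
        return []
    n_coeffs = len(cepstra[0])
    # Peak absolute value of each coefficient across all frames.
    peaks = [max(abs(frame[k]) for frame in cepstra) for k in range(n_coeffs)]
    # Leave all-zero trajectories at zero to avoid division by zero.
    return [
        [frame[k] / peaks[k] if peaks[k] > 0 else 0.0 for k in range(n_coeffs)]
        for frame in cepstra
    ]


# Toy data: three frames, two coefficients per frame.
clean = [[2.0, -1.0], [-4.0, 0.5], [1.0, 2.0]]
# Crude stand-in for noise shrinking the dynamic range (uniform scaling only).
noisy = [[0.5 * c for c in frame] for frame in clean]

adjusted_clean = dynamic_range_adjust(clean)
adjusted_noisy = dynamic_range_adjust(noisy)
```

In this toy case the two adjusted trajectories coincide, which illustrates why range normalization can reduce the mismatch between clean and noisy features. Real noise does not act as a uniform scaling, so the match would only be approximate in practice.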

Yoshikazu Miyanaga | Naoya Wada
