Paper information - Rapid environment adaptation for robust speech recognition

Rapid environment adaptation for robust speech recognition

The paper proposes REALISE, a rapid environment adaptation algorithm based on spectrum equalization. In practical speech recognition applications, differences between the training and testing environments often seriously degrade recognition accuracy. These differences fall into two types in the spectral domain: differences in additive noise and differences in multiplicative noise. The proposed method computes a time-alignment between a testing utterance and the reference pattern closest to it, and then estimates the noise differences between the two along that alignment. The authors then use these differences to adapt all reference patterns to the testing environment. Finally, the testing utterance is recognized using the adapted reference patterns. In a 250-word Japanese recognition task in which the training and testing microphones were of two different types, REALISE improved recognition accuracy from 87% to 96%.
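The abstract does not give the estimation formulas, so the following is only a minimal sketch of the pipeline it describes, under stated assumptions. It assumes a per-frequency-bin linear model y = a * x + b, where a stands for the multiplicative difference and b for the additive difference, uses plain dynamic time warping for the time-alignment, and fits a and b by ordinary least squares over the aligned frames. The function names (`dtw_path`, `estimate_equalizer`, `adapt`) are hypothetical, and the least-squares estimator is an illustrative stand-in, not necessarily the estimator used in REALISE.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[float]  # one spectral frame: one value per frequency bin


def dtw_path(ref: Sequence[Frame], test: Sequence[Frame]) -> List[Tuple[int, int]]:
    """Align two frame sequences with plain DTW; return (ref_idx, test_idx) pairs."""
    def dist(a: Frame, b: Frame) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    n, m = len(ref), len(test)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = dist(ref[i - 1], test[j - 1]) + best

    # Backtrack from the last cell to recover the alignment path.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal
            (cost[i - 1][j], i - 1, j),          # advance reference only
            (cost[i][j - 1], i, j - 1),          # advance test only
        ]
        _, i, j = min(steps)
    path.reverse()
    return path


def estimate_equalizer(
    ref: Sequence[Frame], test: Sequence[Frame], path: Sequence[Tuple[int, int]]
) -> Tuple[List[float], List[float]]:
    """Fit test = gain * ref + offset per bin over aligned frames (assumed model)."""
    gains: List[float] = []
    offsets: List[float] = []
    for k in range(len(ref[0])):
        xs = [ref[i][k] for i, _ in path]
        ys = [test[j][k] for _, j in path]
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        var = sum((x - mean_x) ** 2 for x in xs)
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        gain = cov / var if var > 0 else 1.0  # fall back to unit gain
        gains.append(gain)
        offsets.append(mean_y - gain * mean_x)
    return gains, offsets


def adapt(pattern: Sequence[Frame], gains: Sequence[float],
          offsets: Sequence[float]) -> List[List[float]]:
    """Map one reference pattern into the testing environment."""
    return [[g * x + o for x, g, o in zip(frame, gains, offsets)]
            for frame in pattern]
```

In use, one would call `dtw_path` on the closest reference pattern and the testing utterance, pass the resulting path to `estimate_equalizer`, and then apply `adapt` to every reference pattern before recognition. The alignment is computed on unadapted spectra, so its quality depends on how far apart the two environments are.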
