ADAPTATION AND COMPENSATION: APPROACHES TO MICROPHONE AND SPEAKER INDEPENDENCE IN AUTOMATIC SPEECH RECOGNITION

This paper describes recent efforts by the CMU speech group to address the problems of robustness to changes in environment and speaker. Results are presented in the context of the 1995 ARPA common Hub 3 evaluation of speech recorded through different microphones at different signal-to-noise ratios (SNRs). For speech considered to be of high quality, we addressed speaker variability through a speaker normalization technique. For speech recorded at lower SNRs, we used a combination of environmental compensation techniques previously developed in our group. Speaker normalization yielded a relative error rate reduction of 3.5 percent for clean speech, and the combination of environmental compensation with the use of noise-corrupted speech in training yielded a relative error rate reduction of 54.9 percent for noisy speech.
