Relative mel-frequency cepstral coefficients compensation for robust telephone speech recognition

Robust, computationally simple methods are crucial for the practical application of telephone speech recognition. In this paper, we propose a new channel compensation method that applies a RASTA-like band-pass filter to the mel-frequency cepstral coefficients. Experiments show that, compared with RASTA processing, the proposed method reduces computational complexity without loss of recognition performance, and that it also outperforms cepstral mean subtraction (CMS) and two-level CMS. We also verify that suppressing very low modulation frequencies is an effective approach to robust telephone speech recognition.
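The idea of band-pass filtering each cepstral trajectory over time can be sketched as follows. The abstract does not give the filter coefficients, so this sketch assumes the classic RASTA transfer function from the RASTA literature, H(z) = 0.1 (2 + z^-1 - z^-3 - 2 z^-4) / (1 - 0.98 z^-1), as a stand-in; the function names `rasta_filter`, `cms`, and `filter_mfcc` are hypothetical, and the MFCC frames are assumed to be computed elsewhere. A plain CMS function is included only for comparison.

```python
# Minimal sketch: RASTA-like band-pass filtering of MFCC trajectories.
# ASSUMPTION: the coefficients below are the classic RASTA filter, not
# necessarily the filter used in the paper.

def rasta_filter(trajectory, pole=0.98):
    """Band-pass filter one cepstral coefficient trajectory over frames.

    Implements y[t] = sum_k b[k] * x[t-k] + pole * y[t-1] with zero
    initial state. The numerator taps sum to zero, so a constant
    (channel) offset is removed after a decaying transient.
    """
    numer = [0.2, 0.1, 0.0, -0.1, -0.2]  # 0.1 * (2, 1, 0, -1, -2)
    out = []
    prev_y = 0.0
    for t in range(len(trajectory)):
        acc = 0.0
        for k, b in enumerate(numer):
            if t - k >= 0:
                acc += b * trajectory[t - k]
        prev_y = acc + pole * prev_y
        out.append(prev_y)
    return out


def cms(trajectory):
    """Cepstral mean subtraction over the whole utterance (for comparison)."""
    mean = sum(trajectory) / len(trajectory)
    return [x - mean for x in trajectory]


def filter_mfcc(frames, pole=0.98):
    """Apply rasta_filter independently to each coefficient index.

    frames: list of per-frame MFCC vectors (lists of equal length).
    Returns a list of filtered frames with the same shape.
    """
    n_coeffs = len(frames[0])
    columns = [
        rasta_filter([frame[i] for frame in frames], pole)
        for i in range(n_coeffs)
    ]
    return [[columns[i][t] for i in range(n_coeffs)] for t in range(len(frames))]
```

Under these assumptions, a fixed convolutional channel appears as a constant offset in each cepstral trajectory, and the filter drives that offset toward zero frame by frame, whereas CMS needs the whole utterance to estimate the mean.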
