LOOK-A-HEAD SEQUENTIAL FEATURE VECTOR NORMALIZATION FOR NOISY SPEECH RECOGNITION

Cepstral mean subtraction (CMS), a simple long-term bias removal technique, is used to compensate for transmission and fixed linear channel effects. To handle non-linear channels, a two-level CMS was proposed in which separate channel compensation is performed for segments classified as speech and for segments classified as background. In this paper, methods for extending the two-level CMS to real-time implementation are proposed using a finite number of look-a-head frames of delay, which further reduces the computation and memory requirements of the compensation process. The on-line bias compensation shows a characteristic curve similar to that of batch mode and greatly reduces the sensitivity of the recognizer to transmission noise variability.
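The idea of a sequential two-level CMS with a finite look-a-head can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the update rule, so the running-mean estimator, the function name `lookahead_two_level_cms`, and the assumption that speech/background labels are supplied by some external classifier are all hypothetical choices made for illustration.

```python
def lookahead_two_level_cms(frames, is_speech, lookahead):
    """Subtract a per-class running cepstral mean from each frame.

    frames    : list of cepstral vectors (lists of floats), one per frame
    is_speech : list of bools, True where a frame is labelled speech
                (the labels are assumed to come from an external classifier)
    lookahead : number of future frames available before frame t is emitted
    """
    if not frames:
        return []
    dim = len(frames[0])
    # Running sums and counts per class: False = background, True = speech.
    sums = {False: [0.0] * dim, True: [0.0] * dim}
    counts = {False: 0, True: 0}
    consumed = 0  # number of frames already folded into the running sums
    out = []
    for t, frame in enumerate(frames):
        # Fold in every frame up to t + lookahead, bounded by the utterance end.
        limit = min(len(frames), t + lookahead + 1)
        while consumed < limit:
            label = is_speech[consumed]
            for d in range(dim):
                sums[label][d] += frames[consumed][d]
            counts[label] += 1
            consumed += 1
        # Frame t is always included in its own class, so the count is >= 1.
        label = is_speech[t]
        mean = [s / counts[label] for s in sums[label]]
        out.append([x - m for x, m in zip(frame, mean)])
    return out


# Toy one-dimensional "cepstra": three speech frames and one background frame.
frames = [[1.0], [3.0], [10.0], [5.0]]
labels = [True, True, False, True]
online = lookahead_two_level_cms(frames, labels, lookahead=1)
batch = lookahead_two_level_cms(frames, labels, lookahead=len(frames))
```

In this sketch only two sum vectors and two counters are kept, so memory does not grow with utterance length, and setting `lookahead` to at least the utterance length reproduces batch-mode two-level CMS.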
