Channel Effect Compensation in LSF Domain

This study addresses the problem of channel effects in the line spectrum frequency (LSF) domain. LSF parameters are widely used speech features encoded in the bit stream for low bit-rate speech transmission, so a method of channel effect compensation in the LSF domain is of interest for robust speech recognition over mobile communication and Internet systems. If the bit error rate in the transmission of digitally encoded speech is negligibly low, the channel distortion comes mainly from the microphone or the handset. When the speech signal is represented in terms of the phase of the inverse filter derived from linear prediction (LP) analysis, this channel distortion can be expressed in terms of the channel phase. Further derivation shows that mean subtraction performed on the phase of the inverse filter can minimize the channel effect. Based on this finding, an iterative algorithm is proposed to remove the bias on the LSFs caused by the channel effect. Experiments on simulated channel-distorted speech and on real telephone speech show the effectiveness of the proposed method. Its performance is comparable to that of cepstral mean normalization (CMN) applied to cepstral coefficients.
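
The mean-subtraction idea stated above can be illustrated with a minimal sketch. It is not the iterative LSF algorithm of the paper, which the abstract does not specify. It assumes the convention A(z) = 1 + sum_k a_k z^-k for the inverse filter, and it assumes that the channel adds the same phase term to every frame; the function names, the random LP coefficients, and the sinusoidal channel phase are all hypothetical and chosen only for illustration.

```python
import numpy as np


def inverse_filter_phase(lp_coeffs, n_fft=256):
    """Unwrapped phase of A(e^{jw}) = 1 + sum_k a_k e^{-jwk} for each frame.

    lp_coeffs: array of shape (n_frames, order) holding a_1..a_p per frame
    (assumed sign convention; other texts use 1 - sum_k a_k z^-k).
    """
    lp_coeffs = np.asarray(lp_coeffs, dtype=float)
    ones = np.ones((lp_coeffs.shape[0], 1))
    a = np.hstack([ones, lp_coeffs])            # prepend a_0 = 1
    spectrum = np.fft.rfft(a, n=n_fft, axis=1)  # A(e^{jw}) on a frequency grid
    return np.unwrap(np.angle(spectrum), axis=1)


def mean_subtract(features):
    """Subtract the per-dimension mean taken over all frames (axis 0)."""
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0, keepdims=True)


# Hypothetical data: 50 frames of small 10th-order LP coefficients.
rng = np.random.default_rng(0)
lp = 0.1 * rng.standard_normal((50, 10))
clean_phase = inverse_filter_phase(lp)

# Assumption: the channel contributes one fixed phase curve to every frame.
channel_phase = 0.3 * np.sin(np.linspace(0.0, np.pi, clean_phase.shape[1]))
distorted_phase = clean_phase + channel_phase

# Under that assumption, mean subtraction removes the channel term.
assert np.allclose(mean_subtract(clean_phase), mean_subtract(distorted_phase))
```

Under the additive assumption the cancellation is exact, which is the same reasoning that underlies CMN on cepstral coefficients. Real channels and LP estimation only approximate this, so the sketch shows the principle rather than achievable accuracy.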
