HMM-based channel error mitigation and its application to distributed speech recognition

The emergence of distributed speech recognition has created the need to mitigate the degradations that the transmission channel introduces in the speech features used for recognition. This work proposes a hidden Markov model (HMM) framework from which different mitigation techniques oriented to wireless channels can be derived. First, we study the performance of two techniques based on minimum mean square error (MMSE) estimation, a raw MMSE estimation and a forward MMSE estimation, over additive white Gaussian noise (AWGN) channels. These techniques are also adapted to bursty channels. Then, we propose two new mitigation methods especially suited to bursty channels. The first is based on a forward-backward MMSE estimation and the second on the well-known Viterbi algorithm. Several experiments are carried out, addressing issues such as the application of hard decisions to the received bits and the influence of the estimated channel SNR. The experimental results show that the HMM-based techniques can effectively mitigate channel errors, even in very poor channel conditions. © 2003 Elsevier B.V. All rights reserved.
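As an illustration of the forward-backward MMSE idea named above, the following minimal sketch treats the transmitted codebook indices as HMM states and replaces each received index by the posterior-weighted mean of the codebook centroids. It is not the authors' implementation. The scalar codebook, the transition matrix, the uniform prior, and the hard-decision binary symmetric channel with a known bit error rate are all assumptions made for the example, and every function name is hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two integer indices."""
    return bin(a ^ b).count("1")


def bsc_likelihoods(received, n_bits, ber):
    """P(received | sent = i) for every index i.

    Assumption: memoryless binary symmetric channel with bit error rate `ber`.
    """
    out = []
    for i in range(1 << n_bits):
        d = hamming(received, i)
        out.append((ber ** d) * ((1.0 - ber) ** (n_bits - d)))
    return out


def normalize(v):
    """Scale a non-negative vector so that it sums to one."""
    s = sum(v)
    return [x / s for x in v]


def forward_backward_mmse(received_seq, centroids, trans, prior, n_bits, ber):
    """Return one MMSE estimate per frame using forward-backward posteriors.

    trans[i][j] is the assumed probability of index j following index i.
    """
    n = len(centroids)
    T = len(received_seq)
    b = [bsc_likelihoods(y, n_bits, ber) for y in received_seq]

    # Forward pass: alpha[t][i] is proportional to P(x_t = i, y_1..y_t).
    alpha = [normalize([prior[i] * b[0][i] for i in range(n)])]
    for t in range(1, T):
        pred = [sum(alpha[t - 1][i] * trans[i][j] for i in range(n))
                for j in range(n)]
        alpha.append(normalize([pred[j] * b[t][j] for j in range(n)]))

    # Backward pass: beta[t][i] is proportional to P(y_{t+1}..y_T | x_t = i).
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = normalize(
            [sum(trans[i][j] * b[t + 1][j] * beta[t + 1][j] for j in range(n))
             for i in range(n)])

    # MMSE estimate: posterior-weighted average of the centroids.
    estimates = []
    for t in range(T):
        gamma = normalize([alpha[t][i] * beta[t][i] for i in range(n)])
        estimates.append(sum(g * c for g, c in zip(gamma, centroids)))
    return estimates


if __name__ == "__main__":
    # Toy 2-bit codebook with a strong tendency to repeat the same index.
    centroids = [0.0, 1.0, 2.0, 3.0]
    trans = [[0.85 if i == j else 0.05 for j in range(4)] for i in range(4)]
    prior = [0.25] * 4
    received = [1, 1, 3, 1, 1]  # the middle index may be a channel error
    print(forward_backward_mmse(received, centroids, trans, prior, 2, 0.1))
```

In this toy setup the middle frame is received as index 3 while its neighbours are received as index 1, so the smoothed estimate for that frame is pulled toward the neighbouring centroid. Dropping the backward pass (using `alpha` alone) gives a forward-only estimate in the same spirit.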
