Joint channel decoding - Viterbi recognition for wireless applications

We introduce the concept of joint channel decoding and Viterbi recognition, in which the Viterbi recognizer is modified to take into account the confidence in each decoded feature after channel transmission. We present a metric for evaluating this confidence based on soft-decision decoding. As a case study, we quantize mel-frequency cepstral coefficients (MFCCs) using predictive vector quantization (VQ). The overall source-channel coding scheme, operating at a combined rate of 1 kbps, is shown to provide good recognition accuracy over a wide range of Rayleigh fading channels.
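One common way to make a Viterbi recognizer confidence-aware is to scale each frame's observation log-likelihood by a per-frame weight between 0 and 1. The sketch below illustrates that idea on a toy two-state model. It is an assumption made for illustration, not necessarily the formulation used in the paper: the function name `weighted_viterbi`, the multiplicative weighting, and all numeric values are hypothetical, and the confidence values are supplied by hand rather than derived from a soft-decision channel decoder.

```python
import math


def weighted_viterbi(log_init, log_trans, log_obs, confidence):
    """Viterbi search where frame t's observation log-likelihood is
    multiplied by confidence[t] (assumed to lie in [0, 1]).

    confidence[t] = 1 gives standard Viterbi for that frame;
    confidence[t] = 0 makes the frame contribute nothing to the score.
    """
    n_states = len(log_init)
    score = [log_init[s] + confidence[0] * log_obs[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(log_obs)):
        new_score, ptr = [], []
        for j in range(n_states):
            # Best predecessor of state j at frame t.
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans[i][j])
            ptr.append(best_i)
            new_score.append(
                score[best_i] + log_trans[best_i][j] + confidence[t] * log_obs[t][j]
            )
        score = new_score
        backptr.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, max(score)


# Toy model (hypothetical values): two "sticky" states, three frames.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
# Frame 1 stands in for a feature corrupted by the channel: it strongly
# favors state 1, while its neighbors favor state 0.
log_obs = [[math.log(0.99), math.log(0.01)],
           [math.log(0.001), math.log(0.999)],
           [math.log(0.95), math.log(0.05)]]

path_full, _ = weighted_viterbi(log_init, log_trans, log_obs, [1.0, 1.0, 1.0])
path_erased, _ = weighted_viterbi(log_init, log_trans, log_obs, [1.0, 0.0, 1.0])
```

With full confidence in every frame, the unreliable middle frame pulls the decoded path away from its neighbors; setting its confidence to zero lets the transition model and the surrounding frames decide instead.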
