Design of a speech coder utilizing speech recognition parameters for server-based wireless speech recognition

Existing standard speech coders provide good speech communication quality; however, they are known to degrade the performance of automatic speech recognition (ASR) systems that are deployed for wireless communications in a server-based approach. This paper proposes a speech coder that utilizes speech recognition parameters for wireless speech recognition. To maintain ASR performance at the level of conventional ASR systems implemented in a client-based approach, the proposed speech coder first extracts Mel-frequency cepstral coefficients (MFCCs), which are the typical recognition parameters, and then converts them into linear prediction coefficients for CELP-type speech coding. Because the MFCCs are transmitted directly to the decoder, an ASR system employing the proposed speech coder can provide even better performance than one using standard CELP-type speech coders.
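The abstract does not say how the MFCCs are converted into linear prediction coefficients. The sketch below shows one common way such a conversion can be approximated, and it should not be read as the authors' method. It inverts the cepstral DCT to get log mel energies, interpolates them onto a linear frequency grid, takes the inverse FFT of the resulting power spectrum to obtain autocorrelations, and runs the Levinson-Durbin recursion. The function names, the 23-band mel layout, the 8 kHz sampling rate, the orthonormal DCT-II convention, and the LPC order of 10 are all assumptions chosen for illustration.

```python
import numpy as np


def hz_to_mel(f_hz):
    """Common mel-scale formula (an assumption; front-ends differ)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns (a, err) with a[0] == 1, where the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def mfcc_to_lpc(mfcc, n_mels=23, lpc_order=10, sample_rate=8000, n_fft=256):
    """Approximate LPC coefficients from one MFCC vector (illustrative only).

    Assumes the MFCCs came from an orthonormal DCT-II of natural-log mel
    energies, with band centres equally spaced on the mel scale.
    """
    mfcc = np.asarray(mfcc, dtype=float)
    n_ceps = len(mfcc)

    # 1) Inverse orthonormal DCT-II: truncated cepstrum -> smoothed log mel energies.
    basis = np.cos(np.pi * np.outer(np.arange(n_mels) + 0.5, np.arange(n_ceps)) / n_mels)
    scale = np.full(n_ceps, np.sqrt(2.0 / n_mels))
    scale[0] = np.sqrt(1.0 / n_mels)
    log_mel = basis @ (scale * mfcc)

    # 2) Interpolate the log mel energies onto a linear frequency grid.
    mel_centres = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_mels + 2)[1:-1]
    lin_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    log_power = np.interp(hz_to_mel(lin_freqs), mel_centres, log_mel)

    # 3) Power spectrum -> autocorrelation (Wiener-Khinchin relation).
    autocorr = np.fft.irfft(np.exp(log_power), n=n_fft)[: lpc_order + 1]

    # 4) Autocorrelation -> LPC coefficients.
    return levinson_durbin(autocorr, lpc_order)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, err = mfcc_to_lpc(rng.normal(size=13))   # synthetic MFCCs, not real speech
    print("LPC coefficients:", np.round(a, 4))
    print("prediction error:", err)
```

Because the reconstructed power spectrum is strictly positive, the resulting all-pole filter is stable, which matters if the coefficients are to drive a CELP-style synthesis filter. A real coder would also need to match the exact filterbank, liftering, and log conventions of the recognition front-end it shares parameters with.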
