Recognition from GSM digital speech

This paper addresses the problem of speech rec ognition in the GSM environment. In this context, new sources of distortion, such as transmission errors or speech coding itself, significantly degrade the performance of speech rec ognizers. While conventional approaches deal with these types of distortion after decoding speech, we pr opose to recognize from the digital speech representation of GSM. In particular, our work focuses on the 13 kbit/s RPE-LTP GSM standard speech coder. In order to test our recognizer we have compared it to a conventional recognizer in several simulated situations, which allow us to gain insight into more practical ones. Specifically, besides recognizing from clean digital sp eech and evaluating the influence of speech coding distortion, the pr oposed recognizer is f aced with speech degraded by ra ndom errors, burst errors and frame substitutions. The results are very encouraging: the worse the transmission conditions are, the more recognizing from digital sp eech outperforms the conventional approach.

[1]  Philip Lockwood,et al.  Evaluation of root-normalised front-end (RN LFCC) for speech recognition in wireless GSM network environments , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[2]  Stephan Euler,et al.  The influence of speech coding algorithms on automatic speech recognition , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.

[3]  Kuldip K. Paliwal,et al.  Effect of speech coders on speech recognition performance , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[4]  Vassilios Digalakis,et al.  Robust speech recognition for multiple topological scenarios of the GSM mobile phone system , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[5]  Karl Hellwig,et al.  A regular-pulse excited linear predictive codec , 1988, Speech Commun..

[6]  Shu Lin,et al.  Error control coding : fundamentals and applications , 1983 .

[7]  Chafic Mokbel,et al.  Towards improving ASR robustness for PSN and GSM telephone applications , 1997, Speech Commun..