Text normalization and speech recognition in French

In this paper we present a quantitative investigation into the impact of text normalization on lexica and language models for speech recognition in French. The text normalization process defines what the recognition system considers to be a word. Depending on this definition, we measure different lexical coverages and language model perplexities, both of which are closely related to the speech recognition accuracy obtained on read newspaper texts. Different text normalizations of up to 185M words of newspaper texts are presented, along with the corresponding lexical coverage and perplexity measures. Some normalizations were found to be necessary to achieve good lexical coverage, while others were roughly equivalent in this regard. The choice of normalization used to create language models for the recognition experiments with read newspaper texts was based on these findings. Our best system configuration obtained an 11.2% word error rate in the AUPELF ‘French-speaking’ speech recognizer evaluation test held in February 1997.
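
To illustrate how the definition of a word changes the measured lexical coverage, the following is a minimal sketch in Python. It is not the paper's normalization pipeline: the function names, the two normalization options (lowercasing and splitting at ASCII apostrophes), and the toy sentences are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def normalize(text, lowercase=False, split_apostrophes=False):
    """Tokenize text under a hypothetical normalization setting."""
    if lowercase:
        text = text.lower()
    if split_apostrophes:
        # Treat elided forms as separate words: "l'impact" -> "l' impact".
        # Only the ASCII apostrophe is handled in this sketch.
        text = re.sub(r"(\w)'(\w)", r"\1' \2", text)
    # Keep letters, digits, apostrophes and hyphens; drop other punctuation.
    return re.findall(r"[\w'-]+", text)


def lexical_coverage(train_tokens, test_tokens, vocab_size):
    """Fraction of test tokens found among the most frequent training words."""
    vocab = {w for w, _ in Counter(train_tokens).most_common(vocab_size)}
    covered = sum(1 for w in test_tokens if w in vocab)
    return covered / len(test_tokens)


# Toy data (invented for this example, not taken from the paper's corpora).
train_text = "L'impact de la normalisation. La normalisation définit le mot."
test_text = "l'impact de la définition"

# Coverage with the text left as written.
raw_coverage = lexical_coverage(
    normalize(train_text), normalize(test_text), vocab_size=100
)

# Coverage after lowercasing and splitting elided forms.
normalized_coverage = lexical_coverage(
    normalize(train_text, lowercase=True, split_apostrophes=True),
    normalize(test_text, lowercase=True, split_apostrophes=True),
    vocab_size=100,
)

print(f"raw: {raw_coverage:.2f}, normalized: {normalized_coverage:.2f}")
```

On this toy input the normalized setting covers more of the test text, because "L'impact" and "l'impact" stop counting as two unrelated words. Real corpora would need a fixed vocabulary size and far more text for the comparison to be meaningful.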
