Paper information - INTRODUCING MULTIPLE PRONUNCIATIONS IN SPANISH SPEECH RECOGNITION SYSTEMS

Pronunciation variations are a common source of recognition errors in real-world applications, so specific techniques must be developed to handle them. We describe a method for incorporating pronunciation alternatives that has been tested with both continuous and isolated-word speech recognisers for Spanish. We present an automatic grapheme-to-phoneme system, modified to generate alternate pronunciations. It applies manually developed phonological rules based on certain variations that are well known in the linguistic community but not widely exploited in Spanish speech recognition. We apply this strategy only at the recognition stage of both a continuous speech recogniser for clean speech data and an isolated-word recogniser for a telephone-environment task. We report improvements of up to a 20% decrease in error rate for the continuous speech task, while no significant effect was found for the isolated-word recognition task. We conclude by analysing which effects led to these results and discussing future work.

Javier Ferreiros | José Manuel Pardo | Javier Macias-Guarasa | Luis Villarrubia
