Paper information - Open vocabulary speech recognition with flat hybrid models

Today’s speech recognition systems are able to recognize arbitrary sentences over a large but finite vocabulary. However, many important speech recognition tasks feature an open, constantly changing vocabulary (e.g. broadcast news transcription, translation of political debates). Ideally, a system designed for such open vocabulary tasks would be able to recognize arbitrary, even previously unseen words. To some extent this can be achieved by using sub-lexical language models. We demonstrate that, by using a simple flat hybrid model, we can significantly improve a well-optimized state-of-the-art speech recognition system over a wide range of out-of-vocabulary rates.

Hermann Ney | Maximilian Bisani

[1] Frédéric Bimbot, et al. Variable-length sequence matching for phonetic transcription using joint multigrams, 1995, EUROSPEECH.

[2] Dietrich Klakow, et al. OOV-detection in large vocabulary system using automatically defined word-fragments as fillers, 1999, EUROSPEECH.

[3] James F. Allen, et al. Pronunciation of proper names with a joint n-gram model for bi-directional grapheme-to-phoneme conversion, 2002, INTERSPEECH.

[4] Hermann Ney, et al. Investigations on joint-multigram models for grapheme-to-phoneme conversion, 2002, INTERSPEECH.

[5] James Glass, et al. Modelling out-of-vocabulary words for robust speech recognition, 2002.

[6] Lucian Galescu. Recognition of out-of-vocabulary words with sub-lexical language models, 2003, INTERSPEECH.

[7] Stanley F. Chen, et al. Conditional and joint models for grapheme-to-phoneme conversion, 2003, INTERSPEECH.

[8] Hermann Ney, et al. Bootstrap estimates for confidence intervals in ASR performance evaluation, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[9] Murat Saraclar, et al. Hybrid language models for out of vocabulary word detection in large vocabulary conversational speech recognition, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[10] Hermann Ney, et al. Investigations on error minimizing training criteria for discriminative training in automatic speech recognition, 2005, INTERSPEECH.