Paper Information - Vocabulary Decomposition for Estonian Open Vocabulary Speech Recognition

Vocabulary Decomposition for Estonian Open Vocabulary Speech Recognition

Speech recognition in many morphologically rich languages suffers from a very high out-of-vocabulary (OOV) ratio. Earlier work has shown that vocabulary decomposition methods can practically solve this problem for a subset of these languages. This paper compares various vocabulary decomposition approaches to open vocabulary speech recognition, using Estonian speech recognition as a benchmark. Comparisons are performed utilizing large models of 60000 lexical items and smaller vocabularies of 5000 items. A large vocabulary model based on a manually constructed morphological tagger is shown to give the lowest word error rate, while the unsupervised morphology discovery method Morfessor Baseline gives marginally weaker results. Only the Morfessor-based approach is shown to adequately scale to smaller vocabulary sizes.

Mikko Kurimo | Antti Puurula

[1] Kimmo Koskenniemi, et al. A General Computational Model for Word-Form Recognition and Production, 1984.

[2] Einar Meister, et al. Methods for Estonian Large Vocabulary Speech Recognition, 2006.

[3] Mikko Kurimo, et al. Unlimited vocabulary speech recognition with morph language models applied to Finnish, 2006, Comput. Speech Lang.

[4] Bernard Comrie. Language Universals and Linguistic Typology, 1982.

[5] Andreas Stolcke, et al. Morphology-based language modeling for conversational Arabic speech recognition, 2006, Comput. Speech Lang.

[6] Janne Pylkkönen. An Efficient One-Pass Decoder for Finnish Large Vocabulary Continuous Speech Recognition.

[7] Ebru Arisoy, et al. Unlimited vocabulary speech recognition for agglutinative languages, 2006, NAACL.

[8] Ebru Arisoy, et al. Analysis of Morph-Based Speech Recognition and the Modeling of Out-of-Vocabulary Words Across Languages, 2007, HLT-NAACL.

[9] Hermann Ney, et al. Open vocabulary speech recognition with flat hybrid models, 2005, INTERSPEECH.

[10] Vesa Siivola, et al. Growing an n-gram language model, 2005, INTERSPEECH.