Speech recognition on embedded systems requires components with a low memory footprint and low computational complexity. In this paper, a POS-based (part-of-speech-based) language modeling approach is presented that decreases the number of language model parameters, combined with a method for reducing memory consumption by quantizing the language model penalties. For a short-message dictation application, a language model with a vocabulary of about 10,000 words is generated. With the POS-based approach, the model comprises 70,058 penalties. The memory needed to store these penalties is reduced by about 50% using the presented coding method. Experiments show that the POS-based language model reduces the word error rate (WER) by up to 65% for n-best isolated word recognition compared to recognition without a language model. Moreover, the increase in WER caused by coding the language model penalties is not significant.
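The abstract does not specify the exact coding scheme used for the penalties, but the idea of halving storage via quantization can be sketched with plain uniform quantization. The penalty values, array sizes, and bit width below are illustrative assumptions, not the paper's actual data or method.

```python
import numpy as np

# Hypothetical language-model penalties (negative log probabilities);
# the count 70,058 matches the abstract, the values are made up.
rng = np.random.default_rng(0)
penalties = rng.uniform(0.5, 12.0, size=70058).astype(np.float32)

def quantize(values, bits=8):
    """Uniformly map values onto 2**bits integer levels over their range."""
    lo, hi = float(values.min()), float(values.max())
    levels = (1 << bits) - 1
    codes = np.round((values - lo) / (hi - lo) * levels).astype(np.uint8)
    return codes, lo, hi

def dequantize(codes, lo, hi, bits=8):
    """Recover approximate penalty values from the integer codes."""
    levels = (1 << bits) - 1
    return lo + codes.astype(np.float32) / levels * (hi - lo)

codes, lo, hi = quantize(penalties)
restored = dequantize(codes, lo, hi)

# 8-bit codes need a quarter of the float32 storage; relative to a
# 16-bit fixed-point baseline the saving would be roughly the 50%
# reported in the abstract.
print(penalties.nbytes, codes.nbytes)  # 280232 70058
```

The maximum reconstruction error of such a scheme is half a quantization step, i.e. (hi − lo) / 255 / 2 for 8-bit codes, which is why a coarser penalty representation can leave the WER almost unchanged.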