POS-based language models for large vocabulary speech recognition on embedded systems

Speech recognition on embedded systems requires components with a small memory footprint and low computational complexity. This paper presents a POS-based (part-of-speech-based) language modeling approach that reduces the number of language model parameters, combined with a method that reduces memory consumption by quantizing the language model penalties. For the application of short message dictation, a language model with a vocabulary of about 10,000 words is generated. With the POS-based approach, the model comprises 70,058 penalties. The presented coding method reduces the memory needed to store these penalties by about 50%. Experiments show that the POS-based language model reduces the word error rate (WER) by up to 65% for n-best isolated word recognition compared with recognition without a language model. Moreover, the increase in WER caused by coding the language model penalties is not significant.
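
The abstract does not specify the model structure or the coding scheme, so the following is only a minimal sketch of the two ideas under stated assumptions: a class-based bigram in which each penalty is a negative log probability, and uniform 8-bit scalar quantization of the stored penalties. All names (`pos_bigram_penalty`, `quantize`, `dequantize`), the tag set, and the toy probabilities are hypothetical and are not taken from the paper.

```python
import math
from array import array


def pos_bigram_penalty(word, prev_pos, word_to_pos, pos_bigram, word_given_pos):
    """Penalty of `word` after a word tagged `prev_pos` (assumed class-based bigram).

    Penalties are negative log probabilities, so the product
    P(pos | prev_pos) * P(word | pos) becomes a sum of two stored penalties.
    """
    pos = word_to_pos[word]
    return pos_bigram[(prev_pos, pos)] + word_given_pos[(word, pos)]


def quantize(penalties):
    """Map float penalties to one-byte codes (assumed uniform quantizer)."""
    lo, hi = min(penalties), max(penalties)
    levels = 255  # largest code that fits in one unsigned byte
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = array("B", (min(levels, round((p - lo) / step)) for p in penalties))
    return codes, lo, step


def dequantize(codes, lo, step):
    """Recover approximate penalties from the one-byte codes."""
    return [lo + c * step for c in codes]


# Toy model (hypothetical values): penalties are stored per POS pair and per
# word-in-class rather than per word pair, which keeps the table small.
word_to_pos = {"see": "VERB", "you": "PRON"}
pos_bigram = {("VERB", "PRON"): -math.log(0.4)}
word_given_pos = {("you", "PRON"): -math.log(0.5)}

penalty = pos_bigram_penalty("you", "VERB", word_to_pos, pos_bigram, word_given_pos)

table = [0.25, 1.5, 3.75, 7.0, 12.5]
codes, lo, step = quantize(table)
restored = dequantize(codes, lo, step)
```

In this sketch each penalty occupies one byte; compared with a hypothetical 16-bit representation that would halve the storage, which is only one way a reduction of about 50% could arise. The reconstruction error of each penalty is bounded by half a quantization step.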