Analysis of a simple bipos language model: attempt at a strategy to improve language models for speech recognition

At each point in an utterance, a speech recognizer has to choose the most likely words from among all the words in its vocabulary. To that end, it uses an acoustic model and a language model; the author focuses on the language model. The bipos model is presented and analysed. A method called probability decomposition is introduced to measure which parts of the model perform particularly well or poorly. Based on this analysis, the author modifies the modeling of unknown words, which reduces the entropy by at least 14% (and by up to 21%). Other conclusions drawn from the analysis are also given. The author then attempts a strategy for improving language models in general. To that end, the author defines a class of models called state language models. This class contains most currently employed models, yet these models cover only a small area of the space of all possible state language models. A more systematic study of this space is proposed in order to improve current language models. A statistical method called classification and regression trees is presented as a tool for this purpose.
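
The abstract does not spell out the bipos model or how entropy is measured, so the following is only a minimal sketch under stated assumptions. It assumes the common class-based formulation in which a word's probability is approximated as P(tag | previous tag) * P(word | tag), with part-of-speech tags as the classes. The toy corpus, the names `bipos_prob` and `entropy_bits_per_word`, the unsmoothed maximum-likelihood estimates, and the use of known tags (rather than a sum over all possible tags) are illustrative assumptions, not the author's implementation.

```python
import math
from collections import Counter

# Toy tagged corpus (hypothetical data, for illustration only).
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

START = "<s>"  # assumed sentence-start pseudo-tag

tag_bigrams = Counter()  # counts of (previous tag, tag)
tag_context = Counter()  # counts of a tag used as the previous tag
word_tag = Counter()     # counts of (word, tag)
tag_total = Counter()    # counts of each tag

for sentence in corpus:
    prev = START
    for word, tag in sentence:
        tag_bigrams[(prev, tag)] += 1
        tag_context[prev] += 1
        word_tag[(word, tag)] += 1
        tag_total[tag] += 1
        prev = tag


def bipos_prob(word, tag, prev_tag):
    """Return P(tag | prev_tag) * P(word | tag), unsmoothed (assumed form)."""
    if tag_context[prev_tag] == 0 or tag_total[tag] == 0:
        return 0.0
    p_tag = tag_bigrams[(prev_tag, tag)] / tag_context[prev_tag]
    p_word = word_tag[(word, tag)] / tag_total[tag]
    return p_tag * p_word


def entropy_bits_per_word(sentences):
    """Average negative log2 probability per word over tagged sentences."""
    total, n = 0.0, 0
    for sentence in sentences:
        prev = START
        for word, tag in sentence:
            total -= math.log2(bipos_prob(word, tag, prev))
            n += 1
            prev = tag
    return total / n


test_set = [[("the", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")]]
print(entropy_bits_per_word(test_set))
```

In this unsmoothed sketch, a word never seen in training (for example `bipos_prob("bird", "NOUN", "DET")`) receives probability zero, so the entropy is undefined for any text containing it. This illustrates, in a simplified way, why the treatment of unknown words can have a large effect on measured entropy.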