A class based language model for speech recognition

Class-based language models are often used when there is insufficient data to estimate a word-based language model directly from the training data. In this approach, similar items are clustered into classes, an n-gram language model is estimated over the class tokens, and the probability of each word within a class is then assigned according to the smoothed relative unigram frequencies of the words. Classes expand to lists of single-word tokens; that is, a class cannot represent a sequence of lexical tokens. We propose a more general mechanism for defining a language model class, in which classes are expanded to word sequences through finite-state networks. This allows expansion to word sequences without requiring compound words in the lexicon. Although finite-state models are too brittle to represent sentence-level strings, they can represent class-level strings (dates, names, and titles, for example). We compared perplexity on the ARPA Dec93 ATIS test set and found that the new model reduced perplexity by approximately 17 percent (relative).
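
The scoring idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the finite-state network for each class is approximated here by an explicit table of word sequences with probabilities, and all names (`CLASS_BIGRAM`, `CLASS_EXPANSION`, `sentence_logprob`, `perplexity`) and all probability values are hypothetical, chosen only to show how a class-level n-gram term and a within-class expansion term might combine.

```python
import math

# Hypothetical class-level bigram probabilities P(token | previous token).
# Tokens are either plain words or class tokens such as "[DATE]".
CLASS_BIGRAM = {
    ("<s>", "flights"): 0.5,
    ("flights", "on"): 0.4,
    ("on", "[DATE]"): 0.6,
    ("[DATE]", "</s>"): 0.7,
}

# Hypothetical within-class probabilities P(word sequence | class).
# A real system would derive these from a finite-state network; an
# enumerated table is used here as a simplifying assumption.
CLASS_EXPANSION = {
    "[DATE]": {
        ("december", "first"): 0.25,
        ("may", "third"): 0.25,
        ("monday",): 0.5,
    },
}


def sentence_logprob(segments):
    """Return the natural-log probability of a segmented sentence.

    `segments` is a list of (token, words) pairs, where `token` is the
    class-level token and `words` is the tuple of words it covers.
    For a plain word, `token` is the word itself.
    """
    logp = 0.0
    prev = "<s>"
    for token, words in segments:
        # Class-level n-gram term.
        logp += math.log(CLASS_BIGRAM[(prev, token)])
        # Within-class term: probability of this word sequence given the class.
        if token in CLASS_EXPANSION:
            logp += math.log(CLASS_EXPANSION[token][words])
        prev = token
    # End-of-sentence transition.
    logp += math.log(CLASS_BIGRAM[(prev, "</s>")])
    return logp


def perplexity(logp, num_tokens):
    """Perplexity from a total natural-log probability and a token count."""
    return math.exp(-logp / num_tokens)


segments = [
    ("flights", ("flights",)),
    ("on", ("on",)),
    ("[DATE]", ("december", "first")),
]
total_logp = sentence_logprob(segments)
```

Because the class token `[DATE]` covers the two-word sequence "december first", no compound entry such as "december_first" is needed in the lexicon. Whether the end-of-sentence token is counted in `num_tokens` when computing perplexity is a convention that this sketch leaves to the caller.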
