A stochastic language model for speech recognition integrating local and global constraints

This paper describes a speech recognition system that uses a new stochastic language model integrating local and global constraints. Dependencies between adjacent words are used as local constraints, in the same way as in conventional word N-gram models. To capture global constraints between non-contiguous words, the sequences of function words and of content words are taken into account. Furthermore, it is shown that, by assuming independence between the local and global constraints, the number of parameters to be estimated and stored is greatly reduced. The proposed language model is incorporated into a speech recognizer based on the time-synchronous Viterbi algorithm and compared with word bigram and trigram models. The experimental results show that the proposed method captures linguistic constraints effectively.
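To make the role of the independence assumption concrete, the factorization below is a schematic sketch (not the paper's exact formulation; the symbols $P_{\mathrm{local}}$, $P_{\mathrm{global}}$, $f_j$, and $c_k$ are illustrative). The local term is an ordinary N-gram over adjacent words, while the global terms score the subsequences of function words and content words extracted from the utterance:

\[
P(w_1, \ldots, w_N) \;\approx\; \underbrace{\prod_{i=1}^{N} P_{\mathrm{local}}(w_i \mid w_{i-1})}_{\text{adjacent-word constraints}} \;\cdot\; \underbrace{P_{\mathrm{global}}(f_1, \ldots, f_J)\, P_{\mathrm{global}}(c_1, \ldots, c_K)}_{\text{non-contiguous constraints}}
\]

where $f_1, \ldots, f_J$ and $c_1, \ldots, c_K$ denote the function-word and content-word subsequences of $w_1, \ldots, w_N$. Because the local and global components are estimated separately under this assumption, the parameter count grows roughly with the sum, rather than the product, of the component model sizes.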