1000-word speaker-independent continuous-speech recognition using hidden Markov models

An algorithm based on hidden Markov models is applied to speaker-independent continuous-speech recognition with a 1000-word vocabulary and no syntactic constraints. The signal is limited to 4000 Hz. Word models are built from three-state representations of phonetic units, concatenated according to entries in a lexicon. Measured on DARPA's resource management database, the system achieved 40% correct word recognition. Using several different acoustic features and, where possible, word-specific phonetic modeling improved system performance.
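The construction of word models by concatenating three-state phonetic units can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the state topology nor the transition probabilities, so a simple left-to-right chain with one self-loop per state is assumed here, and the lexicon entries, phone symbols, and the `self_loop` value are hypothetical.

```python
STATES_PER_PHONE = 3  # the abstract specifies three-state phonetic units

# Hypothetical toy lexicon; phone symbols and entries are illustrative only.
LEXICON = {
    "ship": ["sh", "ih", "p"],
    "at": ["ae", "t"],
}


def word_model(word, lexicon=LEXICON, self_loop=0.6):
    """Build a left-to-right word HMM by chaining three-state phone models.

    Returns (labels, trans):
      labels[i] is the (phone, state_index) pair for word state i;
      trans[i][j] is the transition probability from state i to state j,
      where the extra last column is the exit from the word.
    The self-loop probability is an assumed placeholder, not a trained value.
    """
    phones = lexicon[word]
    n = STATES_PER_PHONE * len(phones)
    trans = [[0.0] * (n + 1) for _ in range(n)]
    labels = []
    for phone in phones:
        for s in range(STATES_PER_PHONE):
            i = len(labels)                    # index of this state in the word
            trans[i][i] = self_loop            # stay in the same state
            trans[i][i + 1] = 1.0 - self_loop  # advance to the next state or exit
            labels.append((phone, s))
    return labels, trans


labels, trans = word_model("ship")
print(len(labels), "states;", "first:", labels[0], "last:", labels[-1])
```

Because the last state of each phone advances directly into the first state of the next, the lexicon entry alone determines the word model. Word-specific phonetic modeling, as mentioned in the abstract, would correspond to using different per-state parameters for the same phone in different words, which this sketch does not attempt.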
