CMU robust vocabulary-independent speech recognition system

Efforts to improve the performance of CMU's robust vocabulary-independent (VI) speech recognition systems on the DARPA speaker-independent resource management task are discussed. The improvements are evaluated on 320 sentences randomly selected from the DARPA June 88, February 89, and October 89 test sets. The first improvement involves more detailed acoustic modeling: the authors incorporated more dynamic features computed from the LPC cepstra, which reduced error by 15% over the baseline system. The second improvement comes from a larger training database. The third improvement, made possible by the additional training data, comes from more detailed subword modeling: the authors incorporated word-boundary context into their VI subword modeling, which resulted in a 30% error reduction. Finally, decision-tree allophone clustering was used to find more suitable models for the subword units not covered in the training set, which reduced error by a further 17%.
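As an illustration of what "dynamic features computed from the LPC cepstra" can mean, the following is a minimal sketch, not the authors' implementation. It assumes the dynamic features are simple time differences of cepstral frames, with a second difference obtained by differencing the first. The function names, the `span` parameter, and the edge-clamping rule are hypothetical choices made for this example; the abstract does not specify which dynamic features were used.

```python
from typing import List

Frame = List[float]  # one cepstral vector per analysis frame


def delta_features(cepstra: List[Frame], span: int = 2) -> List[Frame]:
    """Differenced cepstra: d[t] = c[t + span] - c[t - span].

    Frame indices are clamped at the utterance edges (an assumption
    made for this sketch, not something stated in the abstract).
    """
    if not cepstra:
        return []
    last = len(cepstra) - 1
    deltas = []
    for t in range(len(cepstra)):
        ahead = cepstra[min(t + span, last)]
        behind = cepstra[max(t - span, 0)]
        deltas.append([a - b for a, b in zip(ahead, behind)])
    return deltas


def add_dynamic_features(cepstra: List[Frame], span: int = 2) -> List[Frame]:
    """Append first- and second-order differences to each static frame."""
    first = delta_features(cepstra, span)
    second = delta_features(first, span)
    return [c + d1 + d2 for c, d1, d2 in zip(cepstra, first, second)]


if __name__ == "__main__":
    # Toy one-dimensional "cepstra" that rise linearly over five frames.
    frames = [[0.0], [1.0], [2.0], [3.0], [4.0]]
    for row in add_dynamic_features(frames, span=1):
        print(row)
```

Each output frame holds the static coefficient followed by its first and second differences, so the model sees how the spectrum is changing as well as where it currently is.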