Research on Mongolian speech recognition technology started comparatively late and is still at an early stage. In this paper, we optimize the basic resources of a Mongolian speech recognition system and, most importantly, improve its acoustic model. We implement a continuous HMM with Gaussian mixture densities and a multiple-data-stream semi-continuous HMM (SCHMM), both built on context-dependent phonetic models and the decision tree method, and we compare the performance of the two models. Finally, using HTK as the experimental platform, we run a large number of experiments on the test set with a trigram language model and an acoustic model that combines context-dependent phonetic models, the decision tree method, and the multiple-data-stream SCHMM. The results show that system performance is optimized and that word and sentence recognition accuracy are both greatly improved.