Prosody dependent Mandarin speech recognition

In this paper, we discuss how to model and train a Mandarin prosody-dependent acoustic model using an automatically prosody-annotated corpus. Using this corpus, we first apply our proposed methods to train prosody-dependent and prosody-independent tonal syllable models, and then combine these models into a mixed acoustic model. We also use a tone model to improve tonal syllable accuracy by revising the tone of a recognized tonal syllable when the tone model's decision meets a given significance level. Compared with the baseline system, the proposed mixed speech recognition system significantly improves tonal syllable accuracy.
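
As an illustration of the tone revision step, the following is a minimal sketch and not the implementation used in this paper. It assumes the tone model supplies a posterior probability for each candidate tone of a recognized syllable, and it treats the significance level as a simple posterior threshold. The function name `revise_tone`, the `threshold` parameter, and the example posteriors are all hypothetical.

```python
from typing import Dict, Tuple


def revise_tone(
    syllable: str,
    recognized_tone: int,
    tone_posteriors: Dict[int, float],
    threshold: float = 0.9,
) -> Tuple[str, int]:
    """Return the tonal syllable, with its tone possibly revised.

    Assumption: `tone_posteriors` maps each candidate tone to a posterior
    probability from a separate tone model, and `threshold` stands in for
    the significance level mentioned in the abstract.
    """
    # Tone that the (hypothetical) tone model considers most likely.
    best_tone = max(tone_posteriors, key=tone_posteriors.get)

    # Override the recognizer only when the tone model disagrees with it
    # and is confident enough.
    if best_tone != recognized_tone and tone_posteriors[best_tone] >= threshold:
        return syllable, best_tone
    return syllable, recognized_tone


# Confident disagreement: the tone is revised from 1 to 3.
revised = revise_tone("ma", 1, {1: 0.03, 2: 0.02, 3: 0.94, 4: 0.01})

# Weak disagreement: the recognizer's tone is kept.
kept = revise_tone("ma", 1, {1: 0.40, 3: 0.60})
```

A high threshold keeps the recognizer's hypothesis unless the tone model is confident, which limits the risk of introducing new errors through revision.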
