Mixed Models Based Pronunciation Evaluation of Mandarin Tone

Within the framework of a traditional pronunciation evaluation system, we extract discrete and continuous fundamental frequency (F0) features using different methods. We use these features at the speech-frame level and the syllable level, respectively, to build embedded tone models, explicit tone models, and mixed models that combine the two, all for evaluating the pronunciation of Mandarin tones. We then compare how the different extraction methods and models affect evaluation performance. The results show that the mixed models perform best: their average score error rate (ASER) is 0.249, a relative reduction of 29.66% compared with the baseline system.
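The abstract reports the mixed-model ASER and its relative reduction but not the baseline ASER itself. As a rough check on the arithmetic, the sketch below back-computes the baseline value implied by those two figures. It assumes that relative reduction is defined as (baseline - new) / baseline, which the abstract does not state; the function names are hypothetical and the result is an inferred figure, not one taken from the paper.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Relative reduction of `new` with respect to `baseline`.

    Assumed definition (not stated in the abstract): (baseline - new) / baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) / baseline


def implied_baseline(new: float, reduction: float) -> float:
    """Invert the definition above: the baseline b with (b - new) / b == reduction."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return new / (1.0 - reduction)


# Figures quoted in the abstract.
mixed_aser = 0.249
reported_reduction = 0.2966

# Baseline ASER implied by the two reported figures (an inference, not a reported value).
baseline_aser = implied_baseline(mixed_aser, reported_reduction)
print(f"implied baseline ASER: {baseline_aser:.3f}")
```

Under that assumed definition, the implied baseline ASER is roughly 0.354.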
