Experimental Investigation of Mandarin Pronunciation Quality Assessment System

As the most effective confidence measure in computer-assisted language learning (CALL) systems, the posterior probability is widely used, usually with approximations that reduce its computational complexity. In this paper, we analyze the defects of the traditional algorithm and propose two improvements. First, the traditional algorithm replaces the sum in the denominator of the posterior probability with a maximum, which seriously reduces the accuracy of the posterior probability. Taking both computational complexity and system performance into account, we therefore propose a novel algorithm based on a phoneme confusion extended network. Second, the traditional algorithm normalizes the posterior probability by the segment duration. In fact, it is the acoustic likelihood that is more closely related to time, since it grows with the number of frames, so we propose a normalization algorithm based on the acoustic likelihood. Experimental results show that, compared with the traditional algorithm, the proposed algorithm improves system performance significantly, with a relative reduction of about 35% in average score error rate, while hardly increasing the computational complexity.
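
The effect of replacing the sum in the denominator with a maximum can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: the function names, the uniform phoneme priors, and the log-likelihood values are all assumptions made for illustration, and the phoneme confusion extended network and the likelihood-based normalization are not reproduced because the abstract does not give their formulas.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_posterior(target_ll, competitor_lls, use_max):
    """Log posterior of the target phoneme for one segment.

    Assumes uniform priors (an illustrative simplification), so the
    posterior is the target likelihood divided by the total likelihood
    of all candidate phonemes. With use_max=True the sum in the
    denominator is approximated by its largest term.
    """
    all_lls = [target_ll] + list(competitor_lls)
    denominator = max(all_lls) if use_max else log_sum_exp(all_lls)
    return target_ll - denominator

def duration_normalized(log_post, n_frames):
    """Traditional normalization: divide the score by the segment length."""
    return log_post / n_frames

# Hypothetical segment-level acoustic log-likelihoods (not from the paper).
target_ll = -120.0
competitor_lls = [-119.5, -121.0, -125.0]
n_frames = 20

lp_max = log_posterior(target_ll, competitor_lls, use_max=True)
lp_sum = log_posterior(target_ll, competitor_lls, use_max=False)

print("max approximation:", lp_max, duration_normalized(lp_max, n_frames))
print("full sum:         ", lp_sum, duration_normalized(lp_sum, n_frames))
```

Because the maximum is never larger than the full sum, the approximated score is always at least as high as the exact log posterior, and the gap widens when several competing phonemes have likelihoods close to the best one.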
