An Effective CALL System for Strongly Accented Mandarin Speech

In this paper, we address several acoustic problems in computer-assisted language learning (CALL) systems by modifying the acoustic model and features within a speech recognition framework. First, to alleviate channel and speaker distortion, we adopt speaker-dependent Cepstral Mean Normalization (Speaker CMN), which improves the average correlation coefficient (ACC) between human and machine scores from 78.00% to 84.14%. Next, we apply Heteroscedastic Linear Discriminant Analysis (HLDA) to enhance the discriminative ability of the acoustic model, which increases ACC from 84.14% to 84.62%. HLDA also reduces the large human-machine scoring difference for speech of very good or very poor quality, raising the correct-rank rate from 85.59% to 90.99%. Finally, we use Maximum a Posteriori (MAP) adaptation to tune the acoustic model to match the strongly accented test speech. As a result, ACC improves from 84.62% to 86.57%.
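To make the first step concrete, speaker-level cepstral mean normalization can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the mean is taken over all frames of a speaker across that speaker's utterances and applies no variance normalization, neither of which the abstract specifies. The function name `speaker_cmn` and the input layout are hypothetical.

```python
from collections import defaultdict


def speaker_cmn(utterances):
    """Subtract a per-speaker cepstral mean from every frame.

    utterances: list of (speaker_id, frames) pairs, where frames is a list
    of equal-length lists of floats (one cepstral vector per frame).
    Assumption: the mean is pooled over all frames of a speaker.
    """
    sums = {}
    counts = defaultdict(int)

    # First pass: accumulate per-speaker sums of each cepstral coefficient.
    for spk, frames in utterances:
        for frame in frames:
            if spk not in sums:
                sums[spk] = [0.0] * len(frame)
            for i, coeff in enumerate(frame):
                sums[spk][i] += coeff
            counts[spk] += 1

    means = {spk: [s / counts[spk] for s in total] for spk, total in sums.items()}

    # Second pass: remove the speaker mean from each frame.
    return [
        (spk, [[c - m for c, m in zip(frame, means[spk])] for frame in frames])
        for spk, frames in utterances
    ]
```

Because a stationary channel or a speaker's long-term spectral characteristics appear roughly as an additive offset in the cepstral domain, subtracting a mean pooled over all of a speaker's speech removes that offset while leaving the frame-to-frame variation used for scoring.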
