A preliminary study on ASR-based detection of Chinese mispronunciation by Japanese learners

Detecting mispronunciations produced by non-native speakers and providing detailed, instructive feedback are desirable in computer-assisted pronunciation training (CAPT) systems, as they help L2 learners improve their pronunciation more effectively. In this paper, we present a preliminary study on detecting segmental mispronunciations that arise from erroneous articulation tendencies, covering both the place and the manner of articulation. By modeling and detecting these error patterns, feedback based on articulation place and articulation manner can be given. The study focuses on Japanese learners of Chinese. Experimental results show that the approach detects the most representative pronunciation errors moderately well, achieving a false rejection rate of 8.0% and a false acceptance rate of 32.6%. The diagnostic accuracy is 86.0%.
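
The abstract reports three metrics without defining them. The sketch below shows how such figures are commonly computed in CAPT evaluations. It assumes the conventional definitions: false rejection rate over correctly pronounced phones, false acceptance rate over mispronounced phones, and diagnostic accuracy over correctly detected mispronunciations. The paper's own definitions may differ. The class name, field names, and the counts in the example are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    """Phone-level outcome counts (hypothetical structure, for illustration)."""
    true_accept: int        # correct pronunciations accepted as correct
    false_reject: int       # correct pronunciations flagged as errors
    false_accept: int       # mispronunciations accepted as correct
    true_reject: int        # mispronunciations flagged as errors
    correct_diagnosis: int  # flagged mispronunciations whose error type was identified correctly


def false_rejection_rate(c: DetectionCounts) -> float:
    # Share of correct pronunciations that the system wrongly rejects.
    return c.false_reject / (c.true_accept + c.false_reject)


def false_acceptance_rate(c: DetectionCounts) -> float:
    # Share of mispronunciations that the system wrongly accepts.
    return c.false_accept / (c.false_accept + c.true_reject)


def diagnostic_accuracy(c: DetectionCounts) -> float:
    # Share of correctly detected mispronunciations with the right error label.
    return c.correct_diagnosis / c.true_reject


# Toy counts chosen for easy arithmetic; they are NOT the paper's data.
counts = DetectionCounts(
    true_accept=90, false_reject=10,
    false_accept=5, true_reject=15,
    correct_diagnosis=12,
)

frr = false_rejection_rate(counts)
far = false_acceptance_rate(counts)
da = diagnostic_accuracy(counts)
print(f"FRR={frr:.1%}  FAR={far:.1%}  DA={da:.1%}")
```

Under these definitions, a low false rejection rate means learners are seldom told that a correct pronunciation is wrong, while a higher false acceptance rate means some genuine errors go unflagged.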
