Automatic pronunciation error detection in non-native speech: the case of vowel errors in Dutch.

This research aims to analyze and improve automatic pronunciation error detection in a second language (L2), using Dutch vowels spoken by adult non-native learners of Dutch as a test case. A first study of Dutch pronunciation by L2 learners with different first languages (L1s) revealed that vowel pronunciation errors are relatively frequent and often involve subtle acoustic differences between the realization and the target sound. In a second study, automatic pronunciation error detection experiments were conducted to compare existing measures with a metric that takes the observed error patterns into account in order to capture the relevant acoustic differences. Together, the two studies show that error patterns carry information that can be usefully employed in weighted automatic measures of pronunciation quality. In addition, combining such a weighted metric with existing measures appears to lower the equal error rate by 6.1 percentage points, from 0.297 for the Goodness of Pronunciation (GOP) algorithm to 0.236.
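
For readers unfamiliar with the equal error rate (EER) reported above, the following is a minimal sketch of how such a figure can be computed from detector scores. It is not the authors' evaluation code. The function name `equal_error_rate`, the convention that a higher score means "more likely mispronounced", and the toy scores are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def equal_error_rate(scores: Sequence[float],
                     is_error: Sequence[bool]) -> Tuple[float, float]:
    """Return (eer, threshold) for a pronunciation error detector.

    Assumption: a higher score means the realization is more likely
    to be a pronunciation error, and a vowel is flagged when its
    score is greater than or equal to the threshold.
    """
    errors = [s for s, e in zip(scores, is_error) if e]
    correct = [s for s, e in zip(scores, is_error) if not e]
    if not errors or not correct:
        raise ValueError("both erroneous and correct realizations are required")

    best = None  # (gap, eer estimate, threshold)
    for threshold in sorted(set(scores)):
        # Correct realizations wrongly flagged as errors.
        false_alarm_rate = sum(s >= threshold for s in correct) / len(correct)
        # Real errors that the detector fails to flag.
        miss_rate = sum(s < threshold for s in errors) / len(errors)
        gap = abs(false_alarm_rate - miss_rate)
        if best is None or gap < best[0]:
            # With finite data the two rates rarely coincide exactly,
            # so their mean at the closest point is used as the estimate.
            best = (gap, (false_alarm_rate + miss_rate) / 2, threshold)
    return best[1], best[2]


# Hypothetical toy data: eight vowel realizations with detector scores.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [False, False, False, True, False, True, True, True]
eer, threshold = equal_error_rate(toy_scores, toy_labels)
```

The EER is the operating point at which the false alarm rate and the miss rate are equal, so a lower value indicates better separation between correct and erroneous realizations. The reported improvement is the difference between the two rates: 0.297 - 0.236 = 0.061, i.e. 6.1 percentage points.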
