Improving mispronunciation detection and diagnosis of learners' speech with context-sensitive phonological rules based on language transfer

This study demonstrates how knowledge of language transfer can enable a computer-assisted pronunciation teaching (CAPT) system to effectively detect and diagnose salient mispronunciations in second language learners’ speech. Our approach uses an HMM-based speech recognizer with an extended pronunciation lexicon that includes both a model pronunciation for each word and common pronunciation variants produced by our target learners. The pronunciation variants in the extended pronunciation lexicon are generated based on language transfer theory (i.e., knowledge from the first language is transferred to the second language). We find that a lexicon that characterizes language transfer using context-sensitive phonological rules detects and diagnoses errors better than a lexicon generated from context-insensitive rules. Furthermore, predicting errors from language transfer alone can approach the performance of a system whose lexicon is fully informed of all possible pronunciation errors.
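As a rough illustration of how an extended pronunciation lexicon might be generated, the following minimal sketch expands a model pronunciation into variants using rewrite rules with optional left and right contexts. The rule format, the ARPAbet-like phone symbols, the two example rules, and the function name `generate_variants` are all assumptions made for illustration; they are not the rules or the implementation used in the study.

```python
from itertools import product

WORD_BOUNDARY = "#"  # hypothetical symbol marking the word edge

def matches(context, phone):
    """A context of None means 'any neighbour'; otherwise it is a set of phones."""
    return context is None or phone in context

def generate_variants(phones, rules):
    """Expand a model pronunciation into a list of variant pronunciations.

    Each rule is (target, replacement, left_context, right_context).
    An empty-string replacement stands for deletion. Contexts are checked
    against the model pronunciation, not against already-rewritten phones.
    """
    padded = [WORD_BOUNDARY] + list(phones) + [WORD_BOUNDARY]
    options = []
    for i in range(1, len(padded) - 1):
        choices = [padded[i]]  # the model phone is always kept as an option
        for target, replacement, left, right in rules:
            if (padded[i] == target
                    and matches(left, padded[i - 1])
                    and matches(right, padded[i + 1])
                    and replacement not in choices):
                choices.append(replacement)
        options.append(choices)

    variants = []
    for combo in product(*options):
        variant = tuple(p for p in combo if p != "")
        if variant not in variants:
            variants.append(variant)
    return variants

# Illustrative rules only (assumed, not taken from the study):
context_sensitive = [
    ("th", "f", None, None),            # applies everywhere
    ("n", "l", {WORD_BOUNDARY}, None),  # applies only word-initially
]
context_insensitive = [
    ("th", "f", None, None),
    ("n", "l", None, None),             # applies in every position
]

north = generate_variants(["n", "ao", "r", "th"], context_sensitive)
ten_cs = generate_variants(["t", "eh", "n"], context_sensitive)
ten_ci = generate_variants(["t", "eh", "n"], context_insensitive)
```

Under these assumed rules, the word-final /n/ in `ten` yields no variant from the context-sensitive set but an extra one from the context-insensitive set. This shows how context-insensitive rules can add pronunciations to the lexicon that the contextual rule would exclude.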
