Hidden-articulator Markov models for pronunciation evaluation

A robust language-learning system, intended to help students practice a foreign language with a machine tutor, must be able to localize common pronunciation errors. This paper presents a new technique for unsupervised detection of phone-level mispronunciations, developed with language-learning applications in mind. Our method uses multiple hidden-articulator Markov models to asynchronously classify acoustic events in various articulatory domains. It requires no human input other than a pronunciation dictionary covering every word in the end system's vocabulary, and it has been shown to perform as well as a human tutor given the same task. For the majority of systematic mispronunciations investigated in this study, precision in detecting the presence of an error exceeded the 70% inter-annotator agreement reported for our test corpus.
