Paper information - Adaptation in the pronunciation space for non-native speech recognition

Adaptation in the pronunciation space for non-native speech recognition

We introduce a new technique to improve the recognition of non-native speech. The underlying assumption is that for each non-native pronunciation of a speech sound, there is at least one sound in the target language that has a similar native pronunciation. The adaptation is performed by HMM interpolation between appropriate native acoustic models. The interpolation partners are determined automatically in a data-driven manner. Our experiments show that this technique is suitable for both offline adaptation to a whole group of speakers and unsupervised online adaptation to a single speaker. Results are given for spontaneous non-native English speech and for a set of read non-native German utterances.
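
The abstract does not state the interpolation formula, the model type, or the criterion used to choose interpolation partners, so the following is only a minimal sketch under assumptions made for illustration. It assumes each native phone model is reduced to a single discrete output distribution (as with the mixture weights of a semi-continuous HMM), that interpolation is linear with a weight `rho`, and that the partner is the candidate that maximizes the likelihood of the adaptation data. The names `interpolate`, `log_likelihood`, `pick_partner`, and the grid of `rho` values are hypothetical and not taken from the paper.

```python
import math


def interpolate(p, q, rho):
    """Linearly interpolate two discrete output distributions.

    rho = 0 keeps the original model p; rho = 1 replaces it with q.
    (Assumed form; the abstract does not give the formula.)
    """
    return [(1.0 - rho) * a + rho * b for a, b in zip(p, q)]


def log_likelihood(dist, counts):
    """Log-likelihood of observed symbol counts under a distribution."""
    return sum(c * math.log(max(d, 1e-12)) for d, c in zip(dist, counts))


def pick_partner(target, candidates, counts, rhos=(0.25, 0.5, 0.75)):
    """Choose the partner model and weight that best fit the adaptation data.

    Returns (partner_name, rho, score). If no interpolation beats the
    unadapted model, returns (None, 0.0, baseline_score).
    """
    best = (None, 0.0, log_likelihood(target, counts))
    for name, dist in candidates.items():
        for rho in rhos:
            score = log_likelihood(interpolate(target, dist, rho), counts)
            if score > best[2]:
                best = (name, rho, score)
    return best


# Toy data: output distribution of the phone to adapt, two candidate
# native phone models, and symbol counts from non-native adaptation speech.
target = [0.7, 0.2, 0.1]
candidates = {"s": [0.1, 0.8, 0.1], "z": [0.3, 0.3, 0.4]}
counts = [3, 6, 1]

partner, rho, score = pick_partner(target, candidates, counts)
adapted = interpolate(target, candidates[partner], rho)
```

In this toy setup the search is a plain grid over candidates and weights. The same selection could be run once on pooled data from a group of speakers, or repeatedly on recognized utterances from a single speaker, which loosely mirrors the offline and online settings the abstract mentions.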
