Paper information - Multi-accent and accent-independent non-native speech recognition

In this article we present a study of multi-accent and accent-independent non-native speech recognition. We propose several approaches based on phonetic confusion and acoustic adaptation. Our goal is to investigate whether multi-accent non-native speech recognition is feasible without detecting the speaker's origin. Tests on the HIWIRE corpus show that multi-accent pronunciation modeling and acoustic adaptation reduce the word error rate (WER) by up to 76% compared with the canonical models of the target language. We also investigate accent-independent approaches in order to assess how robust the proposed methods are to unseen foreign accents. Experiments show that our approaches handle unseen accents correctly and reduce the WER by up to 55% compared with the models of the target language. Finally, experiments on the TIMIT corpus show that the proposed pronunciation modeling approach maintains recognition accuracy on canonical native speech.
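The results above are stated as WER reductions against a baseline. As a minimal sketch of how such figures are commonly computed, the Python below scores a hypothesis against a reference with a word-level edit distance and then computes a relative reduction. It assumes the reported percentages are relative reductions, which the abstract does not state explicitly. The function names, the example sentences, and the WER values 0.20 and 0.05 are illustrative assumptions, not data from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Fraction of the baseline WER removed by the improved system."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - improved_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical utterance: one substitution and one deletion over six words.
    wer = word_error_rate(
        "turn left heading two seven zero",
        "turn left heading to seven",
    )
    print(f"WER: {wer:.3f}")
    # Hypothetical baseline and adapted WERs, chosen only for illustration.
    print(f"Relative reduction: {relative_wer_reduction(0.20, 0.05):.0%}")
```

Under this reading, "up to 76%" would mean that the best adapted system makes roughly a quarter as many word errors as the canonical target-language models on the same test set.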
