Paper information - Improving pronunciation modeling for non-native speech recognition

Improving pronunciation modeling for non-native speech recognition

In this paper, three different approaches to pronunciation modeling are investigated. Two existing pronunciation modeling approaches, namely the pronunciation dictionary and n-best rescoring approaches, are modified to work with a small amount of non-native speech. We also propose a speaker clustering approach, which is capable of grouping speakers based on their pronunciation habits. Given some speech, the approach can also be used for pronunciation adaptation. This approach is called latent pronunciation analysis. The results show that the conventional pronunciation dictionary performs slightly better than n-best list rescoring, while latent pronunciation analysis has been shown to be beneficial for speaker clustering and can produce nearly the same improvement as the pronunciation dictionary approach, without the need to know the origin of the speaker.

Tien Ping Tan | Laurent Besacier

[1] Silke Goronzy, et al. Robust Adaptation to Non-Native Accents in Automatic Speech Recognition, 2002, Lecture Notes in Computer Science.

[2] Tien Ping Tan, et al. Modeling context and language variation for non-native speech recognition, 2007, INTERSPEECH.

[3] Satoshi Nakamura, et al. A statistical lexicon for non-native speech recognition, 2004, INTERSPEECH.

[4] T. Tan. A French Non-Native Corpus for Automatic Speech Recognition, 2006.

[5] Helmer Strik, et al. Modeling pronunciation variation for ASR: A survey of the literature, 1999, Speech Commun.

[6] Philip C. Woodland, et al. Using accent-specific pronunciation modelling for improved large vocabulary continuous speech recognition, 1997, EUROSPEECH.

[7] Ralf Kompe, et al. Generating non-native pronunciation variants for lexicon adaptation, 2004, Speech Commun.

[8] Maxine Eskénazi, et al. BREF, a large vocabulary spoken corpus for French, 1991, EUROSPEECH.

[9] Eric Atwell, et al. The ISLE Corpus of Non-Native Spoken English, 2000, LREC.

[10] Roland Kuhn, et al. Eigenvoices for speaker adaptation, 1998, ICSLP.

[11] Paul Taylor, et al. The architecture of the Festival speech synthesis system, 1998, SSW.

[12] Manuela Boros, et al. Recognition of non-native German speech with multilingual recognizers, 1999, EUROSPEECH.