Robust HMM training for unified Dutch and German speech recognition

This paper describes our recent work on developing a unified Dutch and German speech recognition system in the SpeechDat domain. The acoustic component of the multilingual system is built by sharing common phonemes across the two languages without preserving any information about the language of origin. We propose a more robust MCE-based training algorithm in which only the language-dependent phoneme models are allowed to be adjusted, according to the type of training data. Experimental results on Dutch and German subword recognition tasks show overall string error rate reductions of about 7% and 13% for the newly trained unified recognizer compared with the conventional MCE-trained multilingual system.
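
The restricted update described above can be illustrated with a small sketch. It is not the paper's implementation: it assumes the usual sigmoid-smoothed MCE loss over string-level scores, reduces each phoneme model to a single scalar mean, and reads "according to the type of training data" as meaning that an utterance may only adjust models specific to its own language while shared models stay fixed. The names `mce_loss` and `masked_update` and the tags "nl", "de", and "shared" are hypothetical.

```python
import math


def mce_loss(correct_score, competitor_scores, gamma=1.0, eta=2.0):
    """Sigmoid-smoothed MCE loss for one utterance (assumed standard form).

    correct_score: log-likelihood-style score of the correct string.
    competitor_scores: scores of the competing (incorrect) strings.
    """
    # Soft maximum over competitors, computed stably around the largest score.
    m = max(competitor_scores)
    avg = sum(math.exp(eta * (s - m)) for s in competitor_scores) / len(competitor_scores)
    anti = m + math.log(avg) / eta
    # Misclassification measure: positive when the competitors win.
    d = anti - correct_score
    return 1.0 / (1.0 + math.exp(-gamma * d))


def masked_update(params, grads, utterance_language, step=0.1):
    """Gradient step that touches only models specific to the utterance language.

    params: {phoneme: {"lang": "nl" | "de" | "shared", "mean": float}}
    grads:  {phoneme: d(loss)/d(mean)}; how these are obtained is out of scope here.
    Assumption: shared and other-language models are left unchanged.
    """
    updated = {}
    for phoneme, model in params.items():
        new_model = dict(model)
        if model["lang"] == utterance_language:
            new_model["mean"] = model["mean"] - step * grads.get(phoneme, 0.0)
        updated[phoneme] = new_model
    return updated


# Toy inventory: one Dutch-only, one German-only, and one shared phoneme model.
params = {
    "ui": {"lang": "nl", "mean": 1.0},
    "pf": {"lang": "de", "mean": 1.0},
    "a": {"lang": "shared", "mean": 1.0},
}
grads = {"ui": 1.0, "pf": 1.0, "a": 1.0}
after_dutch = masked_update(params, grads, "nl")
```

In this toy run a Dutch utterance moves only the Dutch-specific model; the German-specific and shared models keep their original means, which is the masking behaviour the sketch is meant to show.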
