Language adaptation of multilingual phone models for vocabulary independent speech recognition tasks

This paper presents new results on multilingual phone modeling and on adapting multilingual models to a new target language that was not part of their training data. The experiments were carried out with the SpeechDat(M) and MacroPhone databases, which cover French, German, Italian, Portuguese, Spanish and American English. First, we constructed language-dependent and multilingual phone models. On an isolated word task, the recognition rate decreased on average by only 3.2% when 95 multilingual models were used instead of 232 language-dependent models. Second, we investigated adaptation techniques for cross-language transfer and showed that only 100 utterances from a new language were needed for adaptation. With the MAP algorithm, the recognition rate improved from 79.9% to 84.3%. Finally, we defined a phonetic-based dissimilarity measure between two languages and compared language-dependent and multilingual models for the purpose of cross-language transfer.
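
The abstract names the MAP algorithm but does not give its formulation. As a rough illustration only, the sketch below shows the textbook MAP update for a single Gaussian mean, in which the prior mean from the multilingual model is interpolated with statistics from the adaptation utterances. The function name `map_adapt_mean`, the prior weight `tau` and its default value are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def map_adapt_mean(
    prior_mean: Sequence[float],
    frames: Sequence[Sequence[float]],
    occupancies: Sequence[float],
    tau: float = 10.0,  # assumed prior weight, not a value from the paper
) -> List[float]:
    """Textbook MAP re-estimate of one Gaussian mean (illustrative sketch).

    prior_mean  : mean taken from the prior (e.g. multilingual) model
    frames      : adaptation feature vectors from the new language
    occupancies : per-frame occupation probability of this Gaussian
    tau         : weight of the prior relative to the adaptation data
    """
    if len(frames) != len(occupancies):
        raise ValueError("frames and occupancies must have the same length")

    dim = len(prior_mean)
    total_occupancy = sum(occupancies)

    # Occupancy-weighted sum of the adaptation frames, per dimension.
    weighted_sum = [0.0] * dim
    for frame, gamma in zip(frames, occupancies):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]

    # Interpolate between the prior mean and the adaptation data.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + total_occupancy)
        for d in range(dim)
    ]
```

With little adaptation data the estimate stays close to the prior mean, and it moves toward the mean of the adaptation data as the total occupancy grows, which is one reason this family of updates suits small adaptation sets such as the 100 utterances mentioned above.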
