The GlobalPhone Project: Multilingual LVCSR with JANUS-3

This paper describes our recent effort in developing the GlobalPhone database for multilingual large vocabulary continuous speech recognition (LVCSR). In particular, we present the current status of the GlobalPhone corpus, which contains high-quality speech data for nine languages: Arabic, Chinese, Croatian, Japanese, Korean, Portuguese, Russian, Spanish, and Turkish. We also discuss the JANUS-3 toolkit and how it can be applied to multilingual speech recognition using the GlobalPhone database.
