Automatic Language Identification Using Phoneme and Automatically Derived Unit Strings

Language identification (LID) based on phonotactic modeling is presented in this paper. Approaches using phoneme strings and strings of units automatically derived by an Ergodic HMM (EHMM) are compared. The phoneme recognizers were trained on 6 languages from OGI multi-language-corpus and Czech SpeechDat-E. The LID results are obtained on 4 languages. The results show superiority of Czech phoneme recognizer while used in LID and promising trends using the EHMM-derived units.

[1]  Elsie Fogerty Speech , 1933, Encyclopedia of Evolutionary Psychological Science.

[2]  Gérard Chollet,et al.  Segmental vocoder-going beyond the phonetic approach , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[3]  Hynek Hermansky,et al.  Temporal patterns (TRAPs) in ASR of noisy speech , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[4]  Daniel P. W. Ellis,et al.  Feature extraction using non-linear transformation for robust speech recognition on the Aurora database , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[5]  Pavel Matejka,et al.  Recognition of phoneme strings using TRAP technique , 2003, INTERSPEECH.

[6]  Pavel Matejka,et al.  Towards Lower Error Rates in Phoneme Recognition , 2004, TSD.