Experiments for an approach to language identification with conversational telephone speech

This paper presents work on language identification using conversational speech (the LDC Conversational Telephone Speech Database). The baseline system is based on language-dependent phone recognition and phonotactic constraints. It was trained on monologue data and obtained an error rate of around 9% on a commonly used nine-language monologue test set. When the same system was applied to conversational speech from the same nine-language task, performance degraded dramatically, to an error rate of 40%. Based on our analysis of conversational speech, two methods were proposed: (1) pre-processing and (2) post-processing. Without any training data from the conversational speech database, the final system (the baseline enhanced by the two proposed methods) obtained an error rate of 24%, a substantial improvement (a 41% relative error reduction) over the baseline system.
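
The error-reduction figure can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the inputs are the rounded error rates quoted above (40% and 24%). These rounded values give a relative reduction of 40%. The reported 41% presumably reflects unrounded error rates, which is an assumption, as the exact figures are not given here.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fraction of the baseline error removed by the new system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Rounded error rates on conversational speech, as quoted in the abstract.
baseline = 0.40  # baseline system
enhanced = 0.24  # baseline plus pre-processing and post-processing

reduction = relative_error_reduction(baseline, enhanced)
print(f"Relative error reduction: {reduction:.1%}")
```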