Paper information - Brno University of Technology System for NIST 2005 Language Recognition Evaluation

Brno University of Technology System for NIST 2005 Language Recognition Evaluation

This paper presents the language identification (LID) system developed by the Speech@FIT group at Brno University of Technology (BUT) for the NIST 2005 Language Recognition Evaluation. The system consists of two parts: phonotactic and acoustic. The phonotactic system is based on hybrid phoneme recognizers trained on the SpeechDat-E database. Phoneme lattices are used to train and test phonotactic language models. Further improvement is obtained by using anti-models. The acoustic system is based on GMM modeling trained under the maximum mutual information framework. We describe both parts and provide a discussion of performance on the LRE 2005 recognition task.

Lukás Burget | Pavel Matejka | Jan Cernocký | Petr Schwarz

[1] Pavel Matejka,et al. Towards Lower Error Rates in Phoneme Recognition , 2004, TSD.

[2] Lukás Burget,et al. Use of Anti-Models to Further Improve State-of-the-Art PRLM Language Recognition System , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[3] Jean-Luc Gauvain,et al. Language recognition using phone lattices , 2004, INTERSPEECH.

[4] Ian H. Witten,et al. The zero-frequency problem: Estimating the probabilities of novel events in adaptive text compression , 1991, IEEE Trans. Inf. Theory.

[5] William M. Campbell,et al. Acoustic, phonetic, and discriminative approaches to automatic language identification , 2003, INTERSPEECH.

[6] Lukás Burget,et al. Discriminative Training Techniques for Acoustic Language Identification , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[7] Pavel Matejka,et al. Phonotactic language identification using high quality phoneme recognition , 2005, INTERSPEECH.

[8] Andreas Stolcke,et al. SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.

[9] Jordan Cohen,et al. Vocal tract normalization in speech recognition: Compensating for systematic speaker variability , 1995 .

[10] Jonathan Le Roux,et al. Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[11] Pavel Matejka,et al. Hierarchical Structures of Neural Networks for Phoneme Recognition , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[12] Steve Young,et al. The HTK book , 1995 .

[13] Marc A. Zissman,et al. Comparison of Four Approaches to Automatic Language Identification of Telephone Speech , 2004 .

[14] Douglas A. Reynolds,et al. Approaches to language identification using Gaussian mixture models and shifted delta cepstral features , 2002, INTERSPEECH.

[15] Alvin F. Martin,et al. NIST 2003 language recognition evaluation , 2003, INTERSPEECH.