I3A Language Recognition System for Albayzin 2010 LRE

This paper describes the two systems that I3A submitted to the Albayzin 2010 Language Recognition Evaluation. The evaluation is similar to the one organized by NIST every two years, but the target languages are those spoken in the Iberian Peninsula (Spanish, Catalan, Basque, Galician and Portuguese) plus English. Each submission fuses five phonotactic and three acoustic subsystems; the two differ only in how the scores are normalized and fused. State-of-the-art language recognition methods are adapted to and evaluated on the KALAKA-2 database. Our primary system ranked first in the evaluation.
