Data selection and calibration issues in automatic language recognition - investigation with BUT-AGNITIO NIST LRE 2009 system

This paper summarizes the BUT-AGNITIO system for the NIST Language Recognition Evaluation 2009. The post-evaluation analysis aimed mainly at improving the quality of the data (fixing language label problems and detecting overlapping speakers in the training and development sets) and at investigating different compositions of the development set. The paper further investigates the JFA-based acoustic system and reports results for new SVM-PCA systems that go beyond the original BUT-AGNITIO NIST LRE 2009 submission. All results are presented on evaluation data from the NIST LRE 2009 task.
