The Adaptation Schemes In PR-SVM Based Language Recognition

Phonetic-based systems usually convert the input speech into token (i.e. word, phone etc.) sequence and determine the target language from the statistics of the token sequences on different languages. Generally, there are two kinds of statistical representation for token sequences, N-gram language model (PR-LM) and support vector machines (PR- SVM) to perform language classification. In this paper we focus on PR-SVM method. One problem of the PR-SVM is that the statistical representation based on utterance is sparse and inaccurate. To tackle this issue, the adaptation schemes in PR-SVM framework are proposed in this paper. There are two schemes to be used: 1) Adaptation from the Universal N-gram Language Model (UNLM) trained on all languages; 2) Adaptation from the Low-Order N-gram Language Model (LONLM). The experimental results on 2007 NIST LRE tasks show that our method achieves significant gains over the unadapted model.

[1]  William M. Campbell,et al.  High-level speaker verification with support vector machines , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[2]  Rong Tong,et al.  Discriminative Vector for Spoken Language Recognition , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[3]  William M. Campbell,et al.  Language recognition with discriminative keyword selection , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[4]  Samy Bengio,et al.  SVMTorch: Support Vector Machines for Large-Scale Regression Problems , 2001, J. Mach. Learn. Res..

[5]  Douglas E. Sturim,et al.  SVM Based Speaker Verification using a GMM Supervector Kernel and NAP Variability Compensation , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[6]  Marc A. Zissman,et al.  Comparison of : Four Approaches to Automatic Language Identification of Telephone Speech , 2004 .

[7]  Pavel Matejka,et al.  Hierarchical Structures of Neural Networks for Phoneme Recognition , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[8]  Jean-Luc Gauvain,et al.  Language recognition using phone latices , 2004, INTERSPEECH.

[9]  William M. Campbell,et al.  Language Recognition with Word Lattices and Support Vector Machines , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.