A GMM-supervector approach to language recognition with adaptive relevance factor

The Gaussian mixture model (GMM) supervector has proven effective for language recognition. While a speech utterance can be represented by a GMM estimated under the maximum a posteriori (MAP) criterion, the supervector formed from that GMM is observed to shift in the supervector space as the duration of the utterance varies. We propose an adaptive relevance factor for the MAP estimation to mitigate the adverse effect of this duration variability across individual utterances. Moreover, we develop a language recognition system with a Bhattacharyya-based kernel in which the information from the mean vectors and the covariance matrices is separately assigned to corresponding dissimilarity terms. We demonstrate the effectiveness of the proposed adaptive relevance factor and the Bhattacharyya-based kernel on the National Institute of Standards and Technology (NIST) 2009 language recognition evaluation (LRE) task.
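For reference, the following is a minimal sketch of the conventional formulas that these two components build on; the proposed adaptive relevance factor and Bhattacharyya-based kernel modify these standard forms, and their exact definitions are not reproduced here.

% Standard relevance-factor MAP update of the m-th mixture mean (background sketch only):
\[
  \hat{\mu}_m = \alpha_m E_m(x) + (1 - \alpha_m)\,\mu_m,
  \qquad \alpha_m = \frac{n_m}{n_m + r},
\]
where $n_m$ is the soft count of frames assigned to mixture $m$, $E_m(x)$ is the posterior-weighted mean of the utterance for that mixture, and $r$ is the relevance factor, which is fixed in conventional MAP adaptation and made adaptive in the proposed scheme. For two Gaussians $\mathcal{N}(\mu_1,\Sigma_1)$ and $\mathcal{N}(\mu_2,\Sigma_2)$, the Bhattacharyya distance
% First term depends on the means, second term only on the covariances:
\[
  D_B = \frac{1}{8}(\mu_1 - \mu_2)^{\top}\bar{\Sigma}^{-1}(\mu_1 - \mu_2)
        + \frac{1}{2}\ln\frac{|\bar{\Sigma}|}{\sqrt{|\Sigma_1|\,|\Sigma_2|}},
  \qquad \bar{\Sigma} = \frac{\Sigma_1 + \Sigma_2}{2},
\]
splits into a mean-based term and a covariance-based term, a separation of the kind exploited by the kernel described above.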