Speaker recognition using channel factors feature compensation

Channel and environment variability is one of the most important factors affecting the performance of text-independent speaker verification systems. The most effective channel compensation techniques are model based, and most of them have been proposed for Gaussian Mixture Models (GMMs), whereas compensation in the feature domain is typically blind. The aim of this work is to explore techniques that allow more accurate channel compensation in the feature domain. Compensating the features rather than the models has the advantage that the transformed parameters can be used with models of different nature and complexity, and also for different tasks. In this paper we evaluate the effects of compensating channel variability by means of the channel factors approach. In particular, we compare channel variability modeling in the usual GMM domain with our proposed feature-domain compensation technique. We show that the two approaches lead to similar results on the NIST 2005 Speaker Recognition Evaluation data. Moreover, the quality of the transformed features is also assessed in the Support Vector Machines framework for speaker recognition on the same data, and in preliminary experiments on Language Identification.
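The abstract does not spell out the compensation formula, so the following is only a minimal sketch of one common way to apply channel factors in the feature domain. It assumes a diagonal-covariance GMM, a per-component channel subspace matrix `U[m]` (size D x R), and a channel factor vector `z` that has already been estimated for the utterance. Under these assumptions, each frame `x` is shifted by the posterior-weighted channel offset: `x_hat = x - sum_m gamma_m(x) * U[m] z`. All names (`posteriors`, `compensate_frame`, `U`, `z`) are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v
        for xi, mu, v in zip(x, mean, var)
    )

def posteriors(x, weights, means, variances):
    """Occupation probabilities gamma_m(x) of each GMM component."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def channel_offset(U_m, z):
    """Matrix-vector product U_m z, with U_m given as D rows of length R."""
    return [sum(u * zk for u, zk in zip(row, z)) for row in U_m]

def compensate_frame(x, weights, means, variances, U, z):
    """Subtract the posterior-weighted channel offset from one frame.

    Assumed form (not stated in the abstract):
        x_hat = x - sum_m gamma_m(x) * U[m] z
    """
    gammas = posteriors(x, weights, means, variances)
    offsets = [channel_offset(U_m, z) for U_m in U]
    return [
        xi - sum(g * off[d] for g, off in zip(gammas, offsets))
        for d, xi in enumerate(x)
    ]

# Toy example: 2 components, 2-dimensional features, 1 channel factor.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [10.0, 10.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
U = [[[1.0], [0.0]], [[0.0], [1.0]]]  # hypothetical channel subspace
z = [2.0]                             # hypothetical channel factor

x_hat = compensate_frame([0.0, 0.0], weights, means, variances, U, z)
```

Because the output is an ordinary feature vector, the compensated frames can be passed to any downstream classifier, which is the advantage the abstract attributes to working in the feature domain rather than the model domain.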
