论文信息 - Unifying PLDA and polynomial kernel SVMS

Unifying PLDA and polynomial kernel SVMS

Probabilistic linear discriminant analysis (PLDA) is a generative model to explain between and within class variations. When the underlying latent variables are modelled by standard Gaussian distributions, the PLDA recognition scores can be evaluated as a dot product between a high dimensional PLDA feature vector and a weight vector. A key contribution of this paper is showing that the high dimensional PLDA feature vectors can be equivalently (in a non-strict sense) represented as the second-degree polynomial kernel induced features of the vectors formed by concatenating the two input vectors constituting a trial. This equivalence relationship paves the way for the speaker recognition problem to be viewed as a two-class support vector machine (SVM) training problem where higher degree polynomial kernels can give better discriminative power. To alleviate the large scale SVM training problem, we propose a kernel evaluation trick that greatly simplifies the kernel evaluation operations. In our experiments, a combination of multiple second degree polynomial kernel SVMs performed comparably to a state-of-the-art PLDA system. For the analysed test case, SVMs trained with third degree polynomial kernel reduced the EERs on average by 10% relative to that of the SVMs trained with second degree polynomial kernel.

Jason W. Pelecanos | Weizhong Zhu | Sibel Yaman

[1] Thorsten Joachims,et al. Training linear SVMs in linear time , 2006, KDD '06.

[2] Lukás Burget,et al. Full-covariance UBM and heavy-tailed PLDA in i-vector speaker verification , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[3] Mohamed Kamal Omar,et al. On the Use of Non-Linear Polynomial Kernel SVMs in Language Recognition , 2012, INTERSPEECH.

[4] Patrick Kenny,et al. Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[5] Chih-Jen Lin,et al. A dual coordinate descent method for large-scale linear SVM , 2008, ICML '08.

[6] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[7] Pietro Laface,et al. Fast discriminative speaker verification in the i-vector space , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[8] James H. Elder,et al. Probabilistic Linear Discriminant Analysis for Inferences About Identity , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[9] Nello Cristianini,et al. Kernel Methods for Pattern Analysis , 2004 .

[10] Pietro Laface,et al. Pairwise Discriminative Speaker Verification in the ${\rm I}$-Vector Space , 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[11] Lukás Burget,et al. Discriminatively trained Probabilistic Linear Discriminant Analysis for speaker verification , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).