Nonlinear kernel nuisance attribute projection for speaker verification

Nuisance attribute projection (NAP) has been successfully applied in SVM-based speaker verification systems, where it improves performance by projecting out the dimensions of the SVM feature space that cause unwanted variability in the kernel. Previous studies of NAP focused mainly on linear and generalized linear kernel SVMs. In this paper, NAP in nonlinear kernel SVMs, e.g. with polynomial or Gaussian kernels, is investigated. Instead of performing explicit feature expansion and projection in the high-dimensional feature space, kernel principal component analysis is employed to find the nuisance dimensions, and NAP is carried out implicitly by incorporating it into compensated kernel functions. Experimental results on the 2006 NIST SRE corpus indicate the effectiveness of this nonlinear kernel NAP. Compared with linear NAP, nonlinear NAP with a Gaussian kernel obtained about 11% relative improvement in equal error rate (EER).
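The idea of finding nuisance dimensions with kernel principal component analysis and folding the projection into a compensated kernel can be illustrated with a minimal sketch. It is not the paper's exact formulation: it assumes a hypothetical set of nuisance vectors `Z` (for example, vectors that capture session variability), uses uncentered kernel PCA for brevity, and all function names and the `gamma` value are chosen here for illustration only. The compensated kernel subtracts, from the base kernel, the inner product of the two inputs' projections onto the top nuisance directions, so the high-dimensional feature map is never formed explicitly.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    A = np.atleast_2d(A)
    B = np.atleast_2d(B)
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_nuisance_directions(Z, kernel, n_dims):
    """Uncentered kernel PCA on the (assumed) nuisance vectors Z.

    Returns expansion coefficients `alphas` (one column per direction) such
    that each direction v_k = sum_i alphas[i, k] * phi(Z[i]) has unit norm
    in feature space. n_dims must not exceed the rank of the Gram matrix.
    """
    K = kernel(Z, Z)
    eigvals, eigvecs = np.linalg.eigh(K)          # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_dims]      # indices of the largest ones
    return eigvecs[:, top] / np.sqrt(eigvals[top])

def compensated_kernel(X, Y, Z, alphas, kernel):
    """Base kernel minus the part explained by the nuisance directions.

    Equivalent to <P phi(x), P phi(y)> with P = I - V V^T, computed using
    kernel evaluations only.
    """
    proj_x = kernel(X, Z) @ alphas                # projections of phi(x) onto each v_k
    proj_y = kernel(Y, Z) @ alphas
    return kernel(X, Y) - proj_x @ proj_y.T

# Hypothetical usage with random stand-in data.
rng = np.random.default_rng(0)
Z_demo = rng.normal(size=(8, 4))                  # stand-in nuisance vectors
X_demo = rng.normal(size=(5, 4))                  # stand-in speaker vectors
alphas_demo = fit_nuisance_directions(Z_demo, gaussian_kernel, n_dims=3)
K_comp_demo = compensated_kernel(X_demo, X_demo, Z_demo, alphas_demo, gaussian_kernel)
```

The resulting matrix `K_comp_demo` could be passed to an SVM trainer that accepts precomputed kernels. With a linear kernel, the same functions reduce to ordinary explicit projection, which makes a convenient sanity check.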
