Paper information - 12 Kernel Based Text-Independent Speaker Verification

The goal of a person authentication system is to authenticate the claimed identity of a user. When this authentication is based on the voice of the user, regardless of what the user actually said, the system is called a text-independent speaker verification system. Speaker verification systems are increasingly used to secure personal information, particularly in mobile phone based applications. Furthermore, text-independent speaker verification systems are the most widely used because of their simplicity, as they do not require complex speech recognition modules. The most common approach to this task is based on Gaussian Mixture Models (GMMs) (Reynolds et al. 2000), which do not take into account any temporal information. GMMs have been used intensively thanks to their good performance, especially in combination with the Maximum A Posteriori (MAP) adaptation algorithm (Gauvain and Lee 1994). This approach is based on the density estimation of an impostor data distribution, followed by its adaptation to a specific client data set. Note that estimating these densities is not the final goal of speaker verification systems, which is rather to discriminate between the client and impostor classes; hence discriminative approaches appear to be good candidates for this task as well. As a matter of fact, Support Vector Machine (SVM) based systems have been the subject of several recent publications in the speaker verification community, in which they obtain performance similar to or even better than that of GMMs on several text-independent speaker
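The GMM approach summarized in the abstract (estimate an impostor, or "world", density, adapt it to a client, then compare the two) can be illustrated with a small sketch. This is a one-dimensional toy and not the authors' system: the function names, the toy data, the relevance factor of 16, and the choice to adapt only the means while sharing weights and variances are all assumptions made for illustration.

```python
import math

def log_gauss(x, mean, var):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def component_logs(x, weights, means, variances):
    """Log of (weight * density) for every mixture component."""
    return [math.log(w) + log_gauss(x, m, v)
            for w, m, v in zip(weights, means, variances)]

def log_gmm(x, weights, means, variances):
    """Log-likelihood of one frame under a GMM (log-sum-exp for stability)."""
    terms = component_logs(x, weights, means, variances)
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt the world-model means toward the client frames.

    relevance is an assumed value; it controls how strongly the world
    means are kept when little client data supports a component.
    """
    counts = [0.0] * len(means)   # soft frame counts per component
    sums = [0.0] * len(means)     # posterior-weighted sums of frames
    for x in frames:
        terms = component_logs(x, weights, means, variances)
        top = max(terms)
        post = [math.exp(t - top) for t in terms]
        norm = sum(post)
        for k, p in enumerate(post):
            gamma = p / norm
            counts[k] += gamma
            sums[k] += gamma * x
    adapted = []
    for k, world_mean in enumerate(means):
        alpha = counts[k] / (counts[k] + relevance)
        data_mean = sums[k] / counts[k] if counts[k] > 0 else world_mean
        adapted.append(alpha * data_mean + (1 - alpha) * world_mean)
    return adapted

def llr_score(frames, weights, client_means, world_means, variances):
    """Average per-frame log-likelihood ratio, client model vs world model."""
    total = sum(log_gmm(x, weights, client_means, variances)
                - log_gmm(x, weights, world_means, variances)
                for x in frames)
    return total / len(frames)

# Toy world model (hypothetical values) and toy client enrollment data.
weights, world_means, variances = [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0]
enrollment = [2.8, 3.1, 3.3, 2.9, 3.0, 3.2]
client_means = map_adapt_means(enrollment, weights, world_means, variances)

client_score = llr_score([3.0, 3.2, 2.9], weights, client_means,
                         world_means, variances)
impostor_score = llr_score([1.0, 0.8, 1.2], weights, client_means,
                           world_means, variances)
print(client_means, client_score, impostor_score)
```

An access would be accepted when its score exceeds a threshold chosen on separate data. Frame order does not affect the score, which reflects the remark that GMMs ignore temporal information.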

Samy Bengio | Johnny Mariéthoz | Yves Grandvalet

[1] Douglas E. Sturim, et al. Support vector machines using GMM supervectors for speaker verification, 2006, IEEE Signal Processing Letters.

[2] T. Joachims. Support Vector Machines, 2002.

[3] Samy Bengio, et al. A statistical significance test for person authentication, 2004, Odyssey.

[4] Samy Bengio, et al. The Expected Performance Curve, 2003, ICML 2003.

[5] Yann LeCun, et al. Learning a similarity metric discriminatively, with application to face verification, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[6] Sahibsingh A. Dudani. The Distance-Weighted k-Nearest-Neighbor Rule, 1976, IEEE Transactions on Systems, Man, and Cybernetics.

[7] Massimiliano Pontil, et al. Support Vector Machines for 3D Object Recognition, 1998, IEEE Trans. Pattern Anal. Mach. Intell.

[8] Frédéric Bimbot, et al. A comparative evaluation of variance flooring techniques in HMM-based speaker verification, 1998, ICSLP.

[9] Guy Lebanon, et al. Metric learning for text documents, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10] Koby Crammer, et al. Kernel Design Using Boosting, 2002, NIPS.

[11] Samy Bengio, et al. Kernel Based Text-Independent Speaker Verification, 2009.

[12] David Haussler, et al. Exploiting Generative Models in Discriminative Classifiers, 1998, NIPS.

[13] Douglas A. Reynolds, et al. Speaker Verification Using Adapted Gaussian Mixture Models, 2000, Digit. Signal Process.

[14] Steve Renals, et al. SVMSVM: support vector machine speaker verification methodology, 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).

[15] Richard O. Duda, et al. Pattern classification and scene analysis, 1974, A Wiley-Interscience publication.

[16] Kilian Q. Weinberger, et al. Distance Metric Learning for Large Margin Nearest Neighbor Classification, 2005, NIPS.

[17] S. Renals, et al. Speaker Verification Using Sequence Discriminant, 2005.

[18] William M. Campbell, et al. Channel compensation for SVM speaker recognition, 2004, Odyssey.

[19] Steve Renals, et al. Speaker verification using sequence discriminant support vector machines, 2005, IEEE Transactions on Speech and Audio Processing.

[20] Patrick Kenny, et al. Eigenvoice modeling with sparse training data, 2005, IEEE Transactions on Speech and Audio Processing.

[21] William M. Campbell, et al. Support vector machines for speaker and language recognition, 2006, Comput. Speech Lang.

[22] Samy Bengio, et al. A kernel trick for sequences applied to text-independent speaker verification systems, 2007, Pattern Recognit.

[23] Roberto Basili, et al. Learning to Classify Text Using Support Vector Machines: Methods, Theory, and Algorithms by Thorsten Joachims, 2003, Comput. Linguistics.

[24] Alvin F. Martin, et al. The DET curve in assessment of detection task performance, 1997, EUROSPEECH.

[25] Chin-Hui Lee, et al. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains, 1994, IEEE Trans. Speech Audio Process.

[26] William M. Campbell, et al. Generalized linear discriminant sequence kernels for speaker recognition, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[27] Nello Cristianini, et al. Learning the Kernel Matrix with Semidefinite Programming, 2002, J. Mach. Learn. Res.

[28] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.