Performances Evaluation of GMM-UBM and GMM-SVM for Speaker Recognition in Realistic World

This paper presents an automatic speaker recognition system for realistic environments. Most existing speaker recognition methods, which have been shown to be highly effective under noise-free conditions, degrade drastically in noisy environments. In this work, feature vectors composed of Mel Frequency Cepstral Coefficients (MFCC) extracted from the speech signal are used to train Support Vector Machines (SVM) and Gaussian mixture models (GMM). To reduce the effect of noisy environments, cepstral mean subtraction (CMS) is applied to the MFCC. For both the GMM-UBM and GMM-SVM systems, a 2048-mixture universal background model (UBM) is used. The recognition phase was tested with Arabic speakers at different signal-to-noise ratios (SNR) and under three noise conditions drawn from the NOISEX-92 database. The experimental results showed that using appropriate kernel functions with the SVM improved the overall performance of speaker recognition in noisy environments.
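As an illustration of the CMS step mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes the MFCC features are already available as a NumPy array of shape (frames, coefficients). The function name `cepstral_mean_subtraction` and the toy data are hypothetical. CMS subtracts the per-utterance mean of each cepstral coefficient, so any offset that is constant across frames cancels out.

```python
import numpy as np


def cepstral_mean_subtraction(mfcc):
    """Subtract the per-utterance mean from each cepstral coefficient.

    mfcc: array of shape (num_frames, num_coeffs), one row per frame.
    Returns an array of the same shape whose columns have zero mean.
    """
    mfcc = np.asarray(mfcc, dtype=float)
    # Mean over the time axis (frames), computed separately per coefficient.
    return mfcc - mfcc.mean(axis=0, keepdims=True)


# Toy data standing in for real MFCC features (assumption: 13 coefficients).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))
# A constant per-coefficient offset, standing in for a fixed channel effect.
offset = rng.normal(size=(1, 13))
shifted = clean + offset

normalized_clean = cepstral_mean_subtraction(clean)
normalized_shifted = cepstral_mean_subtraction(shifted)
```

In this toy setting, `normalized_clean` and `normalized_shifted` agree up to floating-point error, because the constant offset is removed together with the mean. Additive noise that varies over time is not removed by this step alone.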
