Speaker identification evaluation based on the speech biometric and i-vector model using the TIMIT and NTIMIT databases

Biometrics exploits physiological and behavioural human characteristics, using metrics that measure some characteristic of an individual. The measurement may lead to a one-to-one match, which is called authentication, or to a one-from-N match, which is called identification. In this paper, we exploit a speech biometric, the i-vector, with a low and fixed dimension of 100, to identify speakers. The main structure of the system consists of an i-vector with three fusion methods. The system has low complexity and is efficient because it uses an Extreme Learning Machine (ELM) classifier. It is evaluated with 120 speakers from dialect regions one and four of both the TIMIT and NTIMIT databases, to provide a fair comparison with our previous study based on the traditional Gaussian Mixture Model-Universal Background Model (GMM-UBM) system with a Maximum Likelihood (ML) classifier. The system shows an improved identification rate compared with the classical GMM-UBM.
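
To make the classification step concrete, the following is a minimal sketch of a generic ELM classifier applied to 100-dimensional vectors from 120 speakers. It is not the authors' implementation: the abstract does not specify the hidden-layer size, activation, or regularisation, so `n_hidden`, the `tanh` activation, and `reg` are assumptions, and the synthetic vectors below only stand in for real i-vectors extracted from TIMIT or NTIMIT.

```python
import numpy as np

def train_elm(X, y, n_classes, n_hidden=512, reg=1e-3, seed=0):
    """Fit a basic ELM: random fixed hidden layer, least-squares output layer."""
    rng = np.random.default_rng(seed)
    # Hidden weights are drawn once and never trained (the defining ELM trait).
    W = rng.standard_normal((X.shape[1], n_hidden)) / np.sqrt(X.shape[1])
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                      # hidden-layer activations
    T = np.eye(n_classes)[y]                    # one-hot speaker targets
    # Output weights from ridge-regularised least squares (closed form).
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def predict_elm(model, X):
    """Return the index of the highest-scoring speaker for each row of X."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Synthetic stand-in data (assumption): one random centroid per speaker
# plus Gaussian noise, mimicking 100-dimensional vectors for 120 speakers.
rng = np.random.default_rng(0)
n_speakers, dim = 120, 100
centroids = rng.standard_normal((n_speakers, dim))

def sample(n_per_speaker):
    y = np.repeat(np.arange(n_speakers), n_per_speaker)
    X = centroids[y] + 0.3 * rng.standard_normal((y.size, dim))
    return X, y

X_train, y_train = sample(8)
X_test, y_test = sample(2)

model = train_elm(X_train, y_train, n_classes=n_speakers)
pred = predict_elm(model, X_test)
accuracy = float(np.mean(pred == y_test))       # closed-set identification rate
print(f"identification rate on synthetic data: {accuracy:.3f}")
```

Because only the output weights are solved for, and in closed form, training needs no iterative optimisation, which is consistent with the low complexity the abstract attributes to the ELM classifier. The rate printed here reflects the synthetic data only and says nothing about performance on TIMIT or NTIMIT.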
