A Speaker Identification System Using MFCC Features with VQ Technique

The performance of speaker identification systems has improved with recent advances in speech processing, but there is still a need for improvement in text-independent speaker identification and in suitable modelling techniques for voice feature vectors. It becomes difficult for a person to recognize a voice when uncontrolled noise is added to it. In this paper, feature vectors are extracted from speech using Mel-Frequency Cepstral Coefficients (MFCC), and Vector Quantization (VQ) is implemented through the Linde-Buzo-Gray (LBG) algorithm. Two purpose-built speech databases with added noise, recorded at sampling frequencies of 8000 Hz and 11025 Hz, are used to check the accuracy of the developed speaker identification system under non-ideal conditions. Experiments on these databases show that the number of vectors in the VQ codebook and the sampling frequency both influence identification accuracy significantly.
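
The VQ stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the MFCC vectors have already been extracted (random synthetic vectors stand in for them here), it uses squared Euclidean distance as the distortion measure, and the function names (`lbg`, `identify`, `synthetic`) and parameter values (`eps`, `tol`, `max_iter`) are hypothetical choices made for illustration. One codebook is trained per speaker with LBG-style splitting, and a test utterance is assigned to the speaker whose codebook gives the lowest average distortion.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nearest(vec, codebook):
    """Index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))


def avg_distortion(vectors, codebook):
    """Average distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def lbg(vectors, size, eps=0.01, tol=1e-4, max_iter=50):
    """Train a codebook by repeated splitting and refinement (LBG-style).

    The codebook doubles on every split, so `size` is assumed to be a
    power of two. `eps`, `tol` and `max_iter` are illustrative defaults.
    """
    codebook = [mean(vectors)]  # start from the global centroid
    while len(codebook) < size:
        # Split every codeword into two slightly perturbed copies.
        codebook = [[x + s for x in c] for c in codebook for s in (eps, -eps)]
        prev = float("inf")
        for _ in range(max_iter):
            # Assign each training vector to its nearest codeword.
            clusters = [[] for _ in codebook]
            for v in vectors:
                clusters[nearest(v, codebook)].append(v)
            # Move each codeword to the centroid of its cluster;
            # an empty cluster keeps its previous codeword.
            codebook = [mean(cl) if cl else c for cl, c in zip(clusters, codebook)]
            d = avg_distortion(vectors, codebook)
            if prev - d <= tol * d:  # stop when the improvement is negligible
                break
            prev = d
    return codebook


def identify(test_vectors, codebooks):
    """Return the speaker whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda name: avg_distortion(test_vectors, codebooks[name]))


def synthetic(center, n, rng, dim=4):
    """Hypothetical stand-in for MFCC vectors: Gaussian noise around `center`."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    codebooks = {
        "speaker_a": lbg(synthetic(0.0, 200, rng), size=4),
        "speaker_b": lbg(synthetic(10.0, 200, rng), size=4),
    }
    print(identify(synthetic(0.0, 50, rng), codebooks))
```

In a real system the `synthetic` helper would be replaced by an MFCC front end applied to recorded speech, and the codebook size would be one of the parameters varied in the experiments.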
