Text Independent Speaker Recognition Using Mixed MFCC and WOCOR Methods in Persian Language

Abstract—Speaker recognition usually relies on voiced speech. In text-independent speaker recognition, however, it is better to use specific voiced sounds that appear in all words. In this paper, we use such sounds for speaker recognition. In Persian, each consonant must be followed by a vowel, and the vowels are A I U – æe – ɔ:. It is therefore sufficient for text-independent speaker recognition to find and use these vowels. We exploit both the vocal source excitation signal and the vocal tract system of these vowels. For the vocal tract we use Mel-frequency cepstral coefficients (MFCC), the most prevalent feature parameters in speech and speaker recognition, and for the vocal source excitation we use Wavelet Octave Coefficients of Residuals (WOCOR). Because both methods are highly sensitive to noise, we apply spectral subtraction to suppress the noise.
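The noise-suppression step can be illustrated with a minimal sketch of magnitude spectral subtraction on a single frame. The abstract does not say which variant of spectral subtraction is used, so this is only the basic textbook form. The function names (`dft`, `idft`, `spectral_subtract`), the `noise_mag` estimate, and the `floor` parameter are assumptions introduced for illustration. A naive DFT is used so that the example needs only the Python standard library; a practical system would use an FFT on overlapping windowed frames.

```python
import cmath
import math


def dft(frame):
    """Naive O(n^2) discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]


def spectral_subtract(frame, noise_mag, floor=0.01):
    """Subtract an estimated noise magnitude from each frequency bin.

    frame:     list of real samples (one analysis frame)
    noise_mag: estimated noise magnitude per bin, same length as frame
               (assumed here to come from averaging speech-free frames)
    floor:     fraction of the original magnitude kept as a lower bound,
               so that no bin becomes negative (hypothetical default)
    """
    cleaned = []
    for x, n_mag in zip(dft(frame), noise_mag):
        mag = abs(x)
        # Subtract the noise estimate, but never go below the spectral floor.
        new_mag = max(mag - n_mag, floor * mag)
        # Keep the noisy phase; only the magnitude is modified.
        cleaned.append(cmath.rect(new_mag, cmath.phase(x)))
    return idft(cleaned)


if __name__ == "__main__":
    # A 32-sample sine frame and a flat, made-up noise estimate.
    frame = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
    denoised = spectral_subtract(frame, [0.5] * 32)
    print(len(denoised))
```

In this sketch the noisy phase is reused unchanged, and the spectral floor is one common way to avoid negative magnitudes after subtraction. The denoised frames would then be passed on to MFCC and WOCOR extraction.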
