A Speaker Recognition Method based on Personal Identification Voice and Trapezoid Fuzzy Similarity

A text-dependent speaker recognition method is proposed that uses a trapezoidal fuzzy similarity function to measure the similarity of voice features between a test user and the nearest registered speaker. The trapezoidal fuzzy similarity function is constructed from two sources: three recordings made during enrolment, which serve as the personal identification voice (PIV), and statistical data from an individual recorded many times over a long period, which covers intra-speaker variation. A set of acoustic voice features is also introduced to represent general speaker-dependent and text-dependent characteristics that are effective for modeling the PIV, thus capturing the inter-speaker variation from one speaker to another. Experimental results on 24 speakers recorded in four different sessions show that, with no false acceptances, the proposed system reduces false rejections by 30.05% compared with the traditional nearest-neighbor approach. This work targets applications that require fast processing and place little burden on users.
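To make the idea of a trapezoidal fuzzy similarity concrete, the following is a minimal Python sketch, not the authors' construction. The abstract does not give the exact formula, so everything here is an assumption for illustration: the helper names (`trapezoid`, `build_trapezoid`, `similarity`), the choice to use the minimum and maximum of the enrolment values as the flat top of the trapezoid, the per-feature `margins` standing in for the long-term intra-speaker statistics, and the averaging of per-feature scores.

```python
from typing import Sequence, Tuple


def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Standard trapezoidal membership: 0 outside [a, d], 1 on [b, c], linear on the slopes."""
    if not (a <= b <= c <= d):
        raise ValueError("require a <= b <= c <= d")
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)      # falling edge


def build_trapezoid(enrol: Sequence[float], margin: float) -> Tuple[float, float, float, float]:
    """Hypothetical construction: the core spans the enrolment values,
    and the support is widened on each side by `margin` (assumed to come
    from long-term intra-speaker statistics)."""
    lo, hi = min(enrol), max(enrol)
    return lo - margin, lo, hi, hi + margin


def similarity(test: Sequence[float],
               enrol: Sequence[Sequence[float]],
               margins: Sequence[float]) -> float:
    """Assumed aggregation: mean of the per-feature memberships."""
    scores = []
    for j, x in enumerate(test):
        values = [rec[j] for rec in enrol]  # feature j across enrolment recordings
        scores.append(trapezoid(x, *build_trapezoid(values, margins[j])))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Three enrolment recordings with two made-up features each.
    enrol = [[1.0, 10.0], [1.2, 11.0], [1.1, 10.5]]
    print(similarity([1.1, 11.5], enrol, margins=[0.2, 1.0]))
```

In a scheme of this kind, the score would be computed only against the nearest registered speaker and compared with an acceptance threshold, so that a test utterance that is closest to a speaker but still outside that speaker's expected variation can be rejected.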
