A Comparison of Three Discriminant Models for Automatic Speaker Verification

Automatic Speaker Recognition (ASR) comprises Automatic Speaker Identification (ASI) and Automatic Speaker Verification (ASV). In either case, it is a four-step process consisting of speech data collection, preprocessing of the speech signal (enrolment), pattern matching and result adjudication. The second step produces parametrised speech representative of each speaker considered in a given problem. The pattern matching step uses a discrimination model, which may consist of a single classifier or an architecture incorporating several classifiers. In closed-set ASI, the reference speaker whose speech most closely matches the unknown speech is retained. In ASV, a speaker is accepted only if the degree of matching exceeds a preset threshold. The present study compares the speaker discrimination performance of three speaker discrimination models which are the
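
The difference between the closed-set ASI and ASV decision rules described above can be illustrated with a minimal Python sketch. It is not one of the three discrimination models the study compares: the cosine-similarity score, the two-dimensional feature vectors, the speaker names and the threshold value are all assumptions chosen purely for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Hypothetical match score: cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(unknown, references):
    """Closed-set ASI: retain the reference speaker with the highest match score."""
    return max(references, key=lambda name: cosine_similarity(unknown, references[name]))


def verify(unknown, claimed_reference, threshold):
    """ASV: accept the claimed identity only if the score exceeds the preset threshold."""
    return cosine_similarity(unknown, claimed_reference) > threshold


# Toy reference vectors standing in for parametrised speech (assumed values).
references = {"speaker_a": [1.0, 0.0], "speaker_b": [0.0, 1.0]}
unknown = [0.9, 0.1]

best_match = identify(unknown, references)
accepted = verify(unknown, references["speaker_a"], threshold=0.8)
rejected = verify(unknown, references["speaker_b"], threshold=0.8)
```

In this sketch, identification always returns some reference speaker, whereas verification can reject a claim outright, which is why ASV depends on the choice of threshold.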
