This paper addresses closed-set, text-independent speaker identification from speech samples recorded over the telephone. It is well known that variability in speaker identification performance can be attributed to many factors. One major factor is the inherent difference in how recognizable individual speakers are. In speaker recognition systems, such differences are characterized by using animal names for different types of speakers. In this paper we use "lambs" to refer to those speakers who are particularly easy to imitate in our closed-set, text-independent speaker identification system. That is, when other speakers cannot be correctly recognized, they are much more likely to be recognized as one of these lamb speakers. Lambs significantly degrade the performance of our closed-set, text-independent speaker identification system. We describe a naive de-lambing method that deals with these lamb speakers in order to improve system performance. The speech data for our closed-set speaker identification system come from the NIST 1999 Speaker Recognition Evaluation. Our experiments were conducted on 230 male speakers. For each speaker, we tested both on speech from the same telephone channel and session as the training data and on speech from a different telephone channel and session. Combined, the method developed in this paper results in a 15% relative improvement on the closed-set condition with 45 seconds of training speech and 10 seconds of test speech.
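The notion of a lamb described above can be made concrete with a small sketch. The Python code below is not the de-lambing method of this paper, which the abstract does not specify; it only illustrates one plausible way to flag lamb candidates from identification results, by counting how often each enrolled speaker is the wrong answer for someone else's test utterance. The function names `lamb_counts` and `find_lambs`, the `min_share` threshold, and the toy labels are all assumptions made for illustration.

```python
from collections import Counter


def lamb_counts(true_ids, predicted_ids):
    """Count, per speaker, how many misidentified test utterances
    were wrongly attributed to that speaker."""
    counts = Counter()
    for truth, guess in zip(true_ids, predicted_ids):
        if guess != truth:
            # The utterance was misidentified; 'guess' absorbed the error.
            counts[guess] += 1
    return counts


def find_lambs(true_ids, predicted_ids, min_share=0.5):
    """Return speakers who absorb at least `min_share` of all errors.

    `min_share` is a hypothetical threshold chosen for this sketch,
    not a value taken from the paper.
    """
    counts = lamb_counts(true_ids, predicted_ids)
    total_errors = sum(counts.values())
    if total_errors == 0:
        return []
    return sorted(s for s, c in counts.items() if c / total_errors >= min_share)


# Toy data (hypothetical): four of six utterances are misidentified,
# and three of those four errors land on speaker "c".
true_ids = ["a", "b", "c", "d", "e", "a"]
predicted_ids = ["a", "c", "c", "c", "a", "c"]

lambs = find_lambs(true_ids, predicted_ids)
```

In this toy run, speaker "c" attracts three of the four errors and is the only speaker flagged. A real system would compute these counts over held-out identification trials rather than hand-written labels.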