Very large population text-independent speaker identification using transformation enhanced multi-grained models

We present results on speaker identification with a population of over 10,000 speakers. Speaker modeling is accomplished via our transformation enhanced multi-grained models. We pursue two goals. The first is to study the performance of a number of different systems within the multi-grained modeling framework. The second is to analyze performance as a function of population size. We show that the most complex models within the framework perform best, and demonstrate that, to a good approximation, the identification error rate of the described system scales linearly with the logarithm of the population size. Further, based on our analysis of system performance, we develop a candidate rejection technique that flags cases where confidence in the chosen identity is low.
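The scaling claim above can be read as a model of the form error ≈ a + b·log(N), where N is the population size. The following is a minimal sketch of how such a relationship might be fitted and extrapolated with ordinary least squares. The function names `fit_log_linear` and `predict_error`, the choice of base-10 logarithm, and the numeric values are all assumptions made for illustration; the values are synthetic placeholders, not results from the paper.

```python
import math


def fit_log_linear(pop_sizes, error_rates):
    """Least-squares fit of error ~ a + b * log10(N). Returns (a, b)."""
    xs = [math.log10(n) for n in pop_sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(error_rates) / count
    # Standard closed-form slope and intercept for simple linear regression.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, error_rates))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_error(a, b, pop_size):
    """Evaluate the fitted model at a given population size."""
    return a + b * math.log10(pop_size)


# Synthetic placeholder data generated from a known line (a=0.01, b=0.02).
# These are NOT measurements from the paper.
sizes = [10, 100, 1000, 10000]
errors = [0.01 + 0.02 * math.log10(n) for n in sizes]

a, b = fit_log_linear(sizes, errors)
extrapolated = predict_error(a, b, 100000)
```

Under this model, each tenfold increase in population adds a constant `b` to the error rate, which is what makes extrapolation to larger populations straightforward, provided the linear trend continues to hold.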
