Paper information - Influence of task duration in text-independent speaker verification

Short-duration tasks in text-independent speaker verification have received relatively little attention compared with tasks involving many minutes of speech. In this paper we investigate verification performance over a range of durations, from a few seconds to a few minutes. We begin with a state-of-the-art GMM-based system operating on a few minutes of speech per person and show that the same system is suboptimal on short (10-second) speech recordings. In particular, we highlight that optimal frame selection depends on overall duration. This work sheds some light on the difficulties of transposing recent and important techniques such as SVM-NAP to short-duration tasks.

Nicholas W. D. Evans | John S. D. Mason | Jean-François Bonastre | Benoit G. B. Fauve | Neil Pearson
