Including human expertise in speaker recognition systems: report on a pilot evaluation

The 2010 NIST Speaker Recognition Evaluation (SRE10) included a test of Human Assisted Speaker Recognition (HASR), in which systems based in whole or in part on human expertise were evaluated on limited sets of trials. Participation in HASR was optional, and sites could take part without entering the main evaluation of fully automatic systems. Two HASR trial sets were offered: HASR1, comprising 15 trials, and HASR2, a superset comprising 150 trials. Results were submitted for 20 systems from 15 sites in 6 countries. The trial sets were carefully selected, by a process combining automatic processing and human listening, to include particularly challenging trials. The performance results suggest that the chosen trials were indeed difficult, and the HASR systems did not appear to perform as well on these trials as the best fully automatic systems.
