Measurement Uncertainties in Speaker Recognition Evaluation | NIST

The National Institute of Standards and Technology (NIST) Speaker Recognition Evaluations (SRE) are an ongoing series of projects conducted by NIST. In the NIST SRE, speaker detection performance is measured using a detection cost function, defined as a weighted sum of the probabilities of type I and type II errors. Sampling variability gives rise to measurement uncertainty in the detection cost function, so this uncertainty must be taken into account when evaluating and comparing the performance of speaker recognition systems. In this article, the uncertainties of detection cost functions, expressed as standard errors (SEs) and confidence intervals, are computed using nonparametric two-sample bootstrap methods, drawing on our earlier extensive bootstrap variability studies on large datasets. Data independence is assumed because, when the area under a receiver operating characteristic curve is used as the metric, the bootstrap SEs matched very well the analytical SEs obtained from the Mann-Whitney statistic for independent and identically distributed samples. Examples are provided.

Index Terms: speaker recognition evaluation, biometrics, bootstrap, uncertainty, standard error, confidence interval.
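The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a fixed decision threshold, treats a miss and a false alarm as the two error types, and uses illustrative weights (`c_miss`, `c_fa`, `p_target`) that are not given in the abstract. The function names, the number of bootstrap replications, and the simple percentile interval are likewise assumptions made for the example.

```python
import random
import statistics


def detection_cost(target_scores, nontarget_scores, threshold,
                   c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Weighted sum of the miss and false-alarm rates at a fixed threshold.

    The weights are illustrative assumptions, not values from the abstract.
    """
    # Miss: a target trial scored below the threshold.
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    # False alarm: a non-target trial scored at or above the threshold.
    p_fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


def bootstrap_cost_uncertainty(target_scores, nontarget_scores, threshold,
                               n_boot=1000, alpha=0.05, seed=0):
    """Two-sample nonparametric bootstrap of the detection cost.

    Target and non-target scores are resampled independently, with
    replacement, which treats the scores as i.i.d. within each set.
    Returns the standard error and a percentile confidence interval.
    """
    rng = random.Random(seed)
    costs = []
    for _ in range(n_boot):
        t = rng.choices(target_scores, k=len(target_scores))
        n = rng.choices(nontarget_scores, k=len(nontarget_scores))
        costs.append(detection_cost(t, n, threshold))
    costs.sort()
    se = statistics.stdev(costs)
    # Simple percentile interval; other quantile definitions are possible.
    lo = costs[round((alpha / 2) * (n_boot - 1))]
    hi = costs[round((1 - alpha / 2) * (n_boot - 1))]
    return se, (lo, hi)


if __name__ == "__main__":
    # Synthetic scores, used only to exercise the functions.
    rng = random.Random(1)
    targets = [rng.gauss(2.0, 1.0) for _ in range(500)]
    nontargets = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    cost = detection_cost(targets, nontargets, threshold=1.5)
    se, ci = bootstrap_cost_uncertainty(targets, nontargets, threshold=1.5)
    print(f"cost={cost:.4f}  SE={se:.4f}  95% CI=({ci[0]:.4f}, {ci[1]:.4f})")
```

Under this sketch, two systems would be compared by checking whether the difference in their costs is large relative to the estimated standard errors, rather than by comparing the point estimates alone.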
