Data dependency on measurement uncertainties in speaker recognition evaluation

The National Institute of Standards and Technology (NIST) conducts an ongoing series of Speaker Recognition Evaluations (SRE). Speaker detection performance is measured using a detection cost function, defined as a weighted sum of the probabilities of type I and type II errors. Sampling variability introduces uncertainty into this measurement. In our prior study, the nonparametric two-sample bootstrap method was used to compute the standard errors (SEs) of the detection cost function under the assumption of data independence, drawing on our extensive bootstrap variability studies in ROC analysis on large datasets. In this article, the data dependency caused by multiple uses of the same subjects is taken into account. The data are grouped into target sets and non-target sets, each containing multiple scores. Two bootstrap methods are proposed: a one-layer method, in which the two-sample bootstrap resampling takes place only on the target sets and non-target sets, and a two-layer method, in which resampling subsequently also takes place on the target scores and non-target scores within the sets. The SEs of the detection cost function obtained with these two methods are compared with those obtained under the assumption of data independence. It is found that data dependency increases both the estimated SEs and the variation of the SEs. Some suggestions regarding test design are provided.
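
The two resampling schemes described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the threshold convention (a score at or above the threshold is accepted), the synthetic data, and the cost parameters `c_miss`, `c_fa`, and `p_target` are all hypothetical choices made for the example. The detection cost is taken in its commonly used form, C_miss * P_target * P_miss + C_fa * (1 - P_target) * P_fa.

```python
import random
import statistics


def detection_cost(target_scores, nontarget_scores, threshold,
                   c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Weighted sum of miss and false-alarm rates (weights are illustrative)."""
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa


def resample(items, rng):
    """Draw len(items) elements from items with replacement."""
    return [rng.choice(items) for _ in items]


def bootstrap_se(target_sets, nontarget_sets, threshold,
                 layers=1, n_boot=200, seed=0):
    """Estimate the SE of the detection cost by a set-level bootstrap.

    layers=1: resample whole sets only (one-layer scheme).
    layers=2: resample sets, then resample scores within each drawn set.
    """
    rng = random.Random(seed)
    costs = []
    for _ in range(n_boot):
        # Two-sample step: target and non-target sets are resampled separately.
        t_sets = resample(target_sets, rng)
        n_sets = resample(nontarget_sets, rng)
        if layers == 2:
            t_sets = [resample(s, rng) for s in t_sets]
            n_sets = [resample(s, rng) for s in n_sets]
        # Pool the scores of the drawn sets and recompute the cost.
        t_scores = [x for s in t_sets for x in s]
        n_scores = [x for s in n_sets for x in s]
        costs.append(detection_cost(t_scores, n_scores, threshold))
    return statistics.stdev(costs)


# Synthetic example: each set shares a per-set offset, so scores that fall
# in the same set are correlated with one another.
_rng = random.Random(1)
demo_target_sets = [[_rng.gauss(2.0 + off, 1.0) for _ in range(5)]
                    for off in (_rng.gauss(0.0, 0.5) for _ in range(20))]
demo_nontarget_sets = [[_rng.gauss(0.0 + off, 1.0) for _ in range(5)]
                       for off in (_rng.gauss(0.0, 0.5) for _ in range(20))]

se_one_layer = bootstrap_se(demo_target_sets, demo_nontarget_sets, 1.0, layers=1)
se_two_layer = bootstrap_se(demo_target_sets, demo_nontarget_sets, 1.0, layers=2)
print("one-layer SE:", se_one_layer)
print("two-layer SE:", se_two_layer)
```

Resampling whole sets keeps scores that share a subject together, which is how the sketch respects the dependency; flattening all scores before resampling would correspond to the independence assumption instead.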
