A statistical procedure to adjust for time-interval mismatch in forensic voice comparison

Abstract This paper describes a statistical modeling procedure developed to adjust for a time-interval mismatch that arose in a particular forensic voice comparison case. In that case, the questioned-speaker and known-speaker recordings were made six years apart, whereas in the sample of the relevant population used to train and test the forensic voice comparison system, the multiple recordings of each speaker were made only hours to days apart. The paper also reports the results of empirical validation of the procedure. Although developed for a particular case, the procedure has potential for wider application, because relatively long intervals between the recording of questioned and known speakers are not uncommon in casework.
