Database selection for forensic voice comparison

Defining the relevant population to sample is an important issue in data-based implementation of the likelihood-ratio framework for forensic voice comparison. We present a logical argument that, because an investigator or prosecutor only submits suspect and offender recordings for forensic analysis if they sound sufficiently similar to each other, the appropriate defense hypothesis for the forensic scientist to adopt will usually be that the suspect is not the speaker on the offender recording but is a member of a population of speakers who sound sufficiently similar that an investigator or prosecutor would submit recordings of these speakers for forensic analysis. We propose a procedure for selecting background, development, and test databases using a panel of human listeners, and empirically test an automatic procedure inspired by the above. Although the automatic procedure is not entirely consistent with the logical arguments and human-listener procedure, it serves as a proof of concept for the importance of database selection. A forensic-voice-comparison system using the automatic database-selection procedure outperformed systems with random database
