Realistic Extrinsic Forensic Speaker Discrimination with the Diphthong /ai/

This paper describes a discrimination experiment in forensic speaker recognition using the Australian English diphthong /ai/. A two-level kernel density multivariate likelihood ratio is used as a discriminant function to investigate how well non-contemporaneous same-speaker speech samples of /ai/ can be forensically discriminated from different-speaker speech samples using just this diphthong’s F-pattern at its two targets. Natural speech elicited from 25 male speakers of Australian English is extrinsically evaluated against a reference population of 166 male speakers from Bernard’s database. When samples of 12 diphthong tokens each are compared, a respectable, well-calibrated equal error rate (EER) of between ca. 8% and 10% is obtained. Forensically important aspects of the results are discussed, including an assessment of the suitability of the reference population.
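
For readers unfamiliar with the EER as a summary of discrimination performance, the following is a minimal sketch of how such a figure could be computed from log likelihood ratios for same-speaker and different-speaker comparisons. It is not the paper's implementation: the function name `equal_error_rate`, the threshold sweep, and the small score arrays are all illustrative assumptions, and the scores are made-up values rather than data from the experiment.

```python
import numpy as np


def equal_error_rate(same_llr, diff_llr):
    """Approximate the EER by sweeping a threshold over the observed log-LRs.

    same_llr: log-LRs from same-speaker comparisons (hypothetical input)
    diff_llr: log-LRs from different-speaker comparisons (hypothetical input)
    Returns (eer, threshold) at the point where the two error rates are closest.
    """
    same = np.asarray(same_llr, dtype=float)
    diff = np.asarray(diff_llr, dtype=float)
    thresholds = np.sort(np.concatenate([same, diff]))

    # Miss rate: same-speaker comparisons scoring below the threshold.
    miss = np.array([(same < t).mean() for t in thresholds])
    # False-alarm rate: different-speaker comparisons at or above the threshold.
    false_alarm = np.array([(diff >= t).mean() for t in thresholds])

    # The EER is read off where the two error rates are (nearly) equal.
    i = int(np.argmin(np.abs(miss - false_alarm)))
    return (miss[i] + false_alarm[i]) / 2.0, thresholds[i]


# Illustrative, made-up log-LRs (not results from the paper).
same_speaker_llr = [-0.5, 1.0, 2.0, 3.0]
diff_speaker_llr = [-3.0, -2.0, -1.0, 0.5]
eer, threshold = equal_error_rate(same_speaker_llr, diff_speaker_llr)
print(f"EER = {eer:.2%} at log-LR threshold {threshold}")
```

In a well-calibrated system the threshold at which the two error rates meet would be expected to lie near a log likelihood ratio of zero; the sketch only locates the crossing point and does not itself assess calibration.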