Improving calibration of forensic glass comparisons by considering uncertainty in feature-based elemental data

Abstract Computing likelihood ratios (LRs) that measure the weight of forensic glass evidence from LA-ICP-MS data directly in the feature space, without computing any kind of score as an intermediate step, is a complex problem. It requires a probabilistic two-level model of the within-source and between-source variability of the glass samples, so that elemental profiles measured from glass recovered from a suspect or a crime scene can be compared with glass samples of known origin. Calibration of the likelihood ratios generated using previously reported models is essential to the realistic reporting of the value of glass evidence comparisons. We propose models that outperform previously proposed feature-based LR models, in particular by improving the calibration of the computed LRs. We assume that the within-source variability is heavy-tailed in order to incorporate uncertainty when the available data are scarce, as typically happens in forensic glass comparison. Moreover, we address the complexity of the between-source variability with probabilistic machine learning algorithms, namely a variational autoencoder and a warped Gaussian mixture. Our results show that the overall performance of the likelihood ratios generated by our models is superior to that of classical approaches, and that this gain is due to a dramatic improvement in calibration despite some loss in discriminating power. The calibration of our proposal is also remarkably robust.
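The abstract does not say which metric quantifies the "overall performance" of a set of LRs. As an illustration only, the sketch below assumes the log-likelihood-ratio cost (Cllr), a measure commonly used in the forensic LR literature that penalizes both poor discrimination and poor calibration. The function name `cllr` and the toy LR values are hypothetical and are not taken from the paper.

```python
import math


def cllr(lrs_same_source, lrs_diff_source):
    """Log-likelihood-ratio cost in bits (assumed metric, not stated in the abstract).

    lrs_same_source: LRs from comparisons where both items share a source.
    lrs_diff_source: LRs from comparisons where the items have different sources.
    Lower is better; a system that always reports LR = 1 scores exactly 1.0.
    """
    if not lrs_same_source or not lrs_diff_source:
        raise ValueError("need at least one LR of each kind")
    # Same-source comparisons are penalized when the LR is small.
    pen_ss = sum(math.log2(1.0 + 1.0 / lr) for lr in lrs_same_source) / len(lrs_same_source)
    # Different-source comparisons are penalized when the LR is large.
    pen_ds = sum(math.log2(1.0 + lr) for lr in lrs_diff_source) / len(lrs_diff_source)
    return 0.5 * (pen_ss + pen_ds)


# Toy values (hypothetical): an uninformative system, a well-behaved one,
# and one that is confidently wrong.
uninformative = cllr([1.0, 1.0], [1.0, 1.0])
well_behaved = cllr([100.0, 50.0], [0.01, 0.02])
overconfident_wrong = cllr([0.001], [1000.0])
```

Because strongly misleading LRs are penalized heavily, a model with slightly lower discriminating power can still achieve a lower Cllr if its LRs are better calibrated, which is the kind of trade-off the abstract describes.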
