Chemometrics comes to court: evidence evaluation of chem–bio threat agent attacks

Forensic statistics is a well-established scientific field that analyzes evidence statistically to support legal decisions. It traditionally relies on methods that assume a small number of independent variables and multiple samples. Such methods are less applicable to the highly correlated multivariate data sets generated by emerging high-throughput analytical technologies. Chemometrics offers a wealth of methods for analyzing such complex data sets, so combining the two fields would help identify best practices for forensic statistics in the future. This paper provides a brief introduction to forensic statistics and describes how chemometrics could be integrated with its established methods to improve the evaluation of evidence in court.
