Computational aspects of DNA mixture analysis

Statistical analysis of DNA mixtures for forensic identification is known to pose computational challenges due to the enormous state space of possible DNA profiles. We describe a general method for computing the expectation of a product of discrete random variables using auxiliary variables and probability propagation in a Bayesian network. We propose a Bayesian network representation for genotypes, allowing computations to be performed locally involving only a few alleles at each step. Exploiting appropriate auxiliary variables in combination with this representation allows efficient computation of the likelihood function and prediction of genotypes of unknown contributors. Importantly, we exploit the computational structure to introduce a novel set of diagnostic tools for assessing the adequacy of the model for describing a particular dataset.
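To illustrate the idea of computing the expectation of a product by propagation, the following is a minimal sketch and not the authors' implementation. It assumes the simplest possible network, a Markov chain X_1, ..., X_n of discrete variables, with one nonnegative factor h_i(X_i) per variable. It also assumes that each auxiliary variable is a binary Y_i with P(Y_i = 1 | X_i = x) = h_i(x) / c_i, where c_i = max_x h_i(x) > 0, so that E[prod_i h_i(X_i)] = (prod_i c_i) * P(Y_1 = 1, ..., Y_n = 1). All function and variable names are hypothetical.

```python
from itertools import product
from math import prod, isclose


def expectation_by_propagation(init, trans, factors):
    """E[prod_i h_i(X_i)] for a Markov chain, via forward propagation.

    init:    initial distribution of X_1 (list of probabilities)
    trans:   trans[i][a][b] = P(X_{i+2} = b | X_{i+1} = a), one matrix per step
    factors: factors[i][x] = h_{i+1}(x), nonnegative with a positive maximum
    """
    # Assumed auxiliary-variable construction: scale each factor into [0, 1]
    # so it can serve as P(Y_i = 1 | X_i = x).
    scales = [max(h) for h in factors]
    liks = [[v / c for v in h] for h, c in zip(factors, scales)]

    # Forward pass with evidence Y_i = 1 entered at every node.
    alpha = [p * l for p, l in zip(init, liks[0])]
    for T, lik in zip(trans, liks[1:]):
        alpha = [
            sum(alpha[a] * T[a][b] for a in range(len(alpha))) * lik[b]
            for b in range(len(lik))
        ]

    # sum(alpha) = P(Y_1 = 1, ..., Y_n = 1); undo the scaling.
    return prod(scales) * sum(alpha)


def expectation_by_enumeration(init, trans, factors):
    """Same quantity by summing over every joint state (exponential cost)."""
    n = len(factors)
    total = 0.0
    for xs in product(*(range(len(h)) for h in factors)):
        p = init[xs[0]]
        for i in range(1, n):
            p *= trans[i - 1][xs[i - 1]][xs[i]]
        total += p * prod(factors[i][xs[i]] for i in range(n))
    return total


# Toy example with three binary variables (all numbers are made up).
init = [0.6, 0.4]
trans = [
    [[0.7, 0.3], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]
factors = [[1.0, 2.0], [0.5, 3.0], [2.0, 1.0]]

fast = expectation_by_propagation(init, trans, factors)
slow = expectation_by_enumeration(init, trans, factors)
print(isclose(fast, slow))
```

The propagation version touches only one variable and its neighbour at each step, whereas the enumeration version visits every joint configuration. That contrast is the reason local computation matters when the state space of possible DNA profiles is very large.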
