Evaluation of statistical methods for the analysis of forensic DNA mixtures

Analysis of forensic DNA mixtures recovered from crime scenes is one of the most challenging tasks in forensic science. DNA mixtures raise two main questions: “how many contributors are there?” and “what are the genotypes of the contributing individuals?” Genetic characterization of such samples does not always answer these questions on its own. In fact, whenever more than two alleles are observed at a given locus, several distinct genotypic combinations are plausible for the unknown contributors to the sample, and the number of contributors cannot be determined with absolute certainty. In addition, anomalies related to DNA typing techniques, such as contamination or allele loss (drop-out), can further complicate the analysis.

Numerous statistical methods that facilitate the interpretation of DNA mixtures have been proposed, but they have not met with the expected success in the forensic community. The main explanation is that these methods have not been validated for forensic casework.

To meet this validation requirement, the methods must undergo a rigorous evaluation step. This step raises two questions: i) how should methods be evaluated? and ii) what tools can be used to conduct evaluation studies? In this thesis we attempt to answer both questions. First, we evaluate methods dedicated to two key issues: the estimation of the number of contributors to DNA mixtures and the estimation of drop-out probabilities. Second, we propose “open-source” software that offers a number of functionalities for facilitating method evaluation through the simulation of data commonly encountered in forensic settings.

This thesis aims to provide a concrete answer to the issues raised by forensic DNA mixtures, by providing a methodology for method evaluation and by offering the tools needed to carry it out.
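
The point that observed allele counts only bound the number of contributors, rather than determine it, can be illustrated with a minimal sketch of the maximum allele count rule. This is an illustration under stated assumptions, not the evaluation code used in the thesis: the function name `min_contributors` and the example profile are hypothetical, and the rule assumes diploid contributors, each carrying at most two alleles per locus.

```python
from math import ceil


def min_contributors(profile):
    """Lower bound on the number of contributors (maximum allele count rule).

    `profile` maps a locus name to the alleles observed at that locus.
    Each diploid contributor carries at most two alleles per locus, so a
    locus showing k distinct alleles needs at least ceil(k / 2) contributors.
    """
    if not profile:
        return 0
    max_alleles = max(len(set(alleles)) for alleles in profile.values())
    return ceil(max_alleles / 2)


# Hypothetical three-locus mixture profile (illustrative values only).
mixture = {
    "D3S1358": ["14", "15", "16", "17"],
    "vWA": ["16", "17", "18"],
    "FGA": ["20", "22", "23", "24", "25"],
}

print(min_contributors(mixture))  # prints 3: five alleles at FGA need at least 3 people
```

The result is only a lower bound. Contributors who share alleles, or alleles lost to drop-out, can hide additional individuals, which is why the number of contributors cannot be read off a profile with certainty.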
