Investigation into the performance of different models for predicting stutter.

In this paper we examine five candidate models for the behaviour of the stutter ratio, SR: two log-normal models, two gamma models, and a two-component normal mixture model. In the mixture model, SR at each locus is described by two normal distributions that share the same mean but have different variances: one for the majority of the observations and a second for the less well-behaved ones. We apply each model to sets of known single-source Identifiler™, NGM SElect™ and PowerPlex® 21 DNA profiles to show that our findings apply across different data sets. The SR values observed in the single-source profiles were compared with the SR values calculated from each model. Model performance was assessed by calculating log-likelihoods and comparing differences in the Akaike information criterion (AIC). The two-component normal mixture model systematically outperformed all the others, despite its larger number of parameters. As well as performing well statistically, this model has intuitive appeal for forensic biologists and could be implemented in an expert system that uses a continuous method for DNA interpretation.
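The comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the function names are hypothetical, the fitting routine is a simple ECM-style iteration chosen for brevity, and the parameter counts (2 for a single normal, 4 for the shared-mean mixture) are assumptions for a single locus. It fits a single normal and a two-component normal mixture with a common mean, then compares them using AIC = 2k − 2 ln L, where a lower value indicates the preferred model.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_loglik(data, mu, sigma_core, sigma_tail, weight_core):
    """Log-likelihood of a two-component normal mixture with a shared mean."""
    return sum(
        math.log(
            weight_core * normal_pdf(x, mu, sigma_core)
            + (1.0 - weight_core) * normal_pdf(x, mu, sigma_tail)
        )
        for x in data
    )


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * loglik


def fit_single_normal(data):
    """Maximum-likelihood mean and standard deviation of a single normal."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / n)
    return mu, sigma


def fit_shared_mean_mixture(data, n_iter=200):
    """ECM-style fit of the shared-mean mixture (illustrative, no safeguards
    against degenerate components)."""
    mu, sigma = fit_single_normal(data)
    s_core, s_tail, w = 0.5 * sigma, 2.0 * sigma, 0.8  # assumed starting values
    for _ in range(n_iter):
        # E-step: probability that each observation belongs to the core component.
        resp = []
        for x in data:
            core = w * normal_pdf(x, mu, s_core)
            tail = (1.0 - w) * normal_pdf(x, mu, s_tail)
            resp.append(core / (core + tail))
        # Conditional M-steps: weight, then precision-weighted shared mean,
        # then the two standard deviations given the updated mean.
        w = sum(resp) / len(data)
        prec = [r / s_core**2 + (1.0 - r) / s_tail**2 for r in resp]
        mu = sum(p * x for p, x in zip(prec, data)) / sum(prec)
        s_core = math.sqrt(
            sum(r * (x - mu) ** 2 for r, x in zip(resp, data)) / sum(resp)
        )
        s_tail = math.sqrt(
            sum((1.0 - r) * (x - mu) ** 2 for r, x in zip(resp, data))
            / sum(1.0 - r for r in resp)
        )
    return mu, s_core, s_tail, w


# Synthetic "stutter ratios" for illustration only (not real profile data):
# most values are tightly spread, about 10% are more variable.
rng = random.Random(0)
data = [rng.gauss(0.08, 0.03 if rng.random() < 0.1 else 0.01) for _ in range(1000)]

mu_n, sigma_n = fit_single_normal(data)
loglik_normal = mixture_loglik(data, mu_n, sigma_n, sigma_n, 1.0)
aic_normal = aic(loglik_normal, 2)  # parameters: mean, sd

mu_m, s_core, s_tail, w = fit_shared_mean_mixture(data)
loglik_mix = mixture_loglik(data, mu_m, s_core, s_tail, w)
aic_mix = aic(loglik_mix, 4)  # parameters: mean, two sds, mixing weight

delta_aic = aic_normal - aic_mix  # positive values favour the mixture
print(f"AIC single normal: {aic_normal:.1f}")
print(f"AIC shared-mean mixture: {aic_mix:.1f}")
print(f"Difference in AIC: {delta_aic:.1f}")
```

Because the AIC penalises each extra parameter, the mixture is only preferred when its gain in log-likelihood outweighs the cost of the additional variance and mixing-weight parameters, which is the trade-off the abstract refers to.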
