Amplification of DNA mixtures—Missing data approach
Abstract: This paper presents a model for interpreting the results of STR typing of DNA mixtures, based on a multivariate normal distribution of peak areas. Drawing on previous analyses of controlled experiments with mixed DNA samples, we exploit the linear relationship between peak heights and peak areas, and the linear relationship between the means and variances of the measurements. Furthermore, the contribution of one individual's allele to the mean area of that allele is assumed to be proportional to the average of the height measurements on alleles where the individual is the only contributor. For shared alleles in mixed DNA samples, only the cumulative peak heights and areas can be observed. In keeping with this latent structure, we use the EM algorithm to impute the missing variables under a compound symmetry model. That is, the measurements are subject to intra- and inter-locus correlations that do not depend on the actual alleles of the DNA profiles. Owing to the factorization of the likelihood, the properties of the normal distribution and the use of auxiliary variables, an ordinary implementation of the EM algorithm solves the missing data problem. We estimate the model parameters from a training data set. To assess the weight of evidence provided by the model, we apply it, with the estimated parameters, to STR data from real crime cases involving DNA mixtures.
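To make the missing-data idea concrete, the sketch below shows the kind of imputation an E-step performs when only the summed signal of a shared allele is observed. It is a simplified illustration, not the paper's implementation: it treats the unobserved per-contributor areas as jointly normal with known mean and covariance and returns their conditional expectation given the observed total, using the standard result E[X | 1'X = s] = mu + Sigma 1 (1' Sigma 1)^{-1} (s - 1'mu). The function name `impute_shared_peak` and all numeric values are hypothetical.

```python
def impute_shared_peak(total, means, cov):
    """Conditional mean of per-contributor areas given their observed sum.

    Assumes (for illustration only) X ~ N(means, cov) and that the only
    observation is total = sum(X), as for a shared allele in a mixture.
    """
    n = len(means)
    # Sigma * 1: covariance of each component with the sum
    row_sums = [sum(cov[i][j] for j in range(n)) for i in range(n)]
    # 1' * Sigma * 1: variance of the sum
    var_sum = sum(row_sums)
    # Observed total minus its expected value
    resid = total - sum(means)
    # Each component absorbs a share of the residual proportional to
    # its covariance with the sum
    return [means[i] + row_sums[i] / var_sum * resid for i in range(n)]


# Hypothetical two-person mixture with a compound symmetry covariance:
# equal variances (100) and a common covariance (30).
means = [600.0, 300.0]
cov = [[100.0, 30.0], [30.0, 100.0]]
imputed = impute_shared_peak(1000.0, means, cov)
```

With equal variances and a common correlation, the residual between the observed and expected total is split equally among the contributors, and the imputed values always add up to the observed total. In a full EM algorithm the E-step would also need conditional second moments, and the M-step would re-estimate the parameters; both are omitted here.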