Impact of SNP microarray analysis of compromised DNA on kinship classification success in the context of investigative genetic genealogy

Single nucleotide polymorphism (SNP) data generated with microarray technologies have been used to solve murder cases through investigative genetic genealogy (IGG), in which investigative leads are obtained by identifying relatives of the unknown perpetrator in accessible genomic databases. However, SNP microarrays were developed for relatively high input DNA quantity and quality, and SNP microarray data from compromised DNA, as typically obtained from crime scene stains, are largely missing. By applying the Illumina Global Screening Array (GSA) to 264 DNA samples with systematically altered quantity and quality, we empirically tested the impact of SNP microarray analysis of compromised DNA on kinship classification success, as relevant in IGG. Reference data from manufacturer-recommended input DNA quality and quantity were used to estimate genotype accuracy in the compromised DNA samples and to simulate data for relatives of different degrees. Although a stepwise decrease of input DNA amount from 200 nanograms to 6.25 picograms led to decreased SNP call rates and increased genotyping errors, kinship classification success did not decrease down to 250 picograms for siblings and 1st cousins and down to 1 nanogram for 2nd cousins, whereas at 25 picograms and below kinship classification success was zero. A stepwise decrease of input DNA quality via increased DNA fragmentation reduced genotyping accuracy as well as kinship classification success, which dropped to zero at an average DNA fragment size of 150 base pairs. Combining decreased DNA quantity and quality in mock casework and skeletal samples further highlighted possibilities and limitations. Overall, GSA analysis achieved maximal kinship classification success from input DNA quantities 800 to 200 times lower than the manufacturer-recommended amount, although DNA quality plays a key role too, and compromised DNA produced false negative rather than false positive kinship classifications.
Author Summary

Investigative genetic genealogy (IGG), i.e., identifying unknown perpetrators of crime by tracing their relatives in genomic databases using microarray-based single nucleotide polymorphism (SNP) data, is a recently emerging field. However, SNP microarrays were developed for much higher DNA quantity and quality than typically available from crime scenes, and SNP microarray data on DNA compromised in quality and quantity are largely missing. As a first attempt to investigate how SNP microarray analysis of such compromised DNA impacts kinship classification success in the context of IGG, we performed systematic SNP microarray analyses on DNA samples below the manufacturer-recommended quantity and quality, as well as on mock casework samples and skeletal remains. Beyond IGG, our results are relevant for any SNP microarray analysis of compromised DNA, such as DNA-based prediction of appearance and biogeographic ancestry in forensics and anthropology.
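The quantities named in the abstract (SNP call rate, genotype accuracy against a reference, and the fold reduction in input DNA) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy genotypes, and the choice of concordance over jointly called SNPs as the accuracy measure are assumptions made here for illustration. Genotypes are assumed to be allele-pair strings in a consistent allele order, with `None` marking a no-call.

```python
from typing import Optional, Sequence

Genotype = Optional[str]  # e.g. "AG"; None marks a no-call (assumed encoding)


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of SNPs on the array that received a genotype call."""
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def concordance(sample: Sequence[Genotype], reference: Sequence[Genotype]) -> float:
    """Fraction of SNPs called in both datasets whose genotypes agree."""
    pairs = [(s, r) for s, r in zip(sample, reference)
             if s is not None and r is not None]
    if not pairs:
        return 0.0
    return sum(s == r for s, r in pairs) / len(pairs)


def fold_reduction(recommended_ng: float, tested_ng: float) -> float:
    """How many times lower the tested input is than the recommended input."""
    return recommended_ng / tested_ng


# Toy data (hypothetical): a reference run and a low-input run of the same donor.
reference = ["AA", "AG", "GG", "AG", "AA"]
sample = ["AA", "AA", None, "AG", "AA"]

print(call_rate(sample))               # share of SNPs called in the low-input run
print(concordance(sample, reference))  # agreement over jointly called SNPs
print(fold_reduction(200.0, 0.25))     # 200 ng versus 250 pg
print(fold_reduction(200.0, 1.0))      # 200 ng versus 1 ng
```

The last two lines reproduce the 800-fold and 200-fold figures from the abstract, taking 200 nanograms as the recommended input, and 250 picograms and 1 nanogram as the lowest inputs at which kinship classification success did not decrease.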
