The STRidER Report on Two Years of Quality Control of Autosomal STR Population Datasets

STRidER, the STRs for Identity ENFSI Reference Database, is a curated, freely available online allele frequency database, quality control (QC) and software platform for autosomal Short Tandem Repeats (STRs), developed under the endorsement of the International Society for Forensic Genetics. Continuous updates add STR loci and populations to the frequency database and cover many further STR-related aspects. One significant innovation is the QC of autosomal STR data provided prior to publication of datasets. Such scrutiny was previously lacking, leaving QC to authors, reviewers and editors, which led to an unacceptably high error rate in scientific papers. Scrutiny of 184 STR datasets containing >177,000 individual genotypes, submitted in the first two years of STRidER QC since 2017, revealed that about two-thirds of the datasets were either withdrawn by the authors after initial feedback or rejected based on a conservative error rate. Almost no error-free submissions were received, which clearly shows that centralized QC and data curation are essential to maintain the high quality standard required in forensic genetics. While many errors had a minor impact on the resulting allele frequencies, multiple error categories were commonly found within single datasets, and several datasets contained serious flaws. We discuss the factors that caused the errors in order to draw attention to recurring pitfalls and thus contribute to better quality of autosomal STR datasets and allele frequency reports.
