Genome-wide screening for highly discriminative SNPs for personal identification and their assessment in world populations.

DNA profiling is used in forensic investigations to identify perpetrators, missing family members and disaster victims. Forensic applications based on single nucleotide polymorphisms (SNPs) are emerging rapidly and have the potential to replace the widely used panels based on short tandem repeats (STRs), so a well-designed SNP panel is needed to support this transition. Here we present a panel of 175 SNP markers (referred to as the Fudan ID Panel, or FID), selected from ∼3.6 million SNPs, for personal identification. We optimized and validated the FID panel in 729 Chinese individuals using next-generation sequencing (NGS) technology. We showed that the SNPs in the panel have very high heterozygosity as well as low within- and among-continent differentiation, enabling the FID panel to exhibit high discrimination power in both regional and worldwide populations, with average match probabilities ranging from 4.77×10^-71 to 1.06×10^-64 across 54 world populations. As biomedical research advances, SNPs associated with physical anthropological, physiological, behavioral and phenotypic traits will eventually be added to forensic panels, which will revolutionize criminal investigation.
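To illustrate why high heterozygosity translates into low match probabilities, the following minimal sketch computes the per-locus and combined random match probability for biallelic SNPs. It assumes Hardy-Weinberg equilibrium and independence between loci, which is a textbook simplification and not necessarily the exact procedure used for the FID panel; the function names and the example allele frequencies are hypothetical.

```python
import math


def heterozygosity(p):
    """Expected heterozygosity of a biallelic SNP with allele frequency p
    (assumes Hardy-Weinberg equilibrium)."""
    return 2.0 * p * (1.0 - p)


def locus_match_probability(p):
    """Probability that two unrelated individuals share a genotype at one
    biallelic SNP: the sum of squared genotype frequencies under HWE."""
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)
    return sum(g * g for g in genotype_freqs)


def log10_combined_match_probability(allele_freqs):
    """log10 of the combined match probability across loci, assuming the
    loci are independent. Working in log space avoids floating-point
    underflow when many loci are multiplied together."""
    return sum(math.log10(locus_match_probability(p)) for p in allele_freqs)


# Hypothetical example: 175 SNPs, each with allele frequency 0.5
# (the frequency that maximizes heterozygosity for a biallelic marker).
example_freqs = [0.5] * 175
print(heterozygosity(0.5))
print(locus_match_probability(0.5))
print(log10_combined_match_probability(example_freqs))
```

Under these assumptions, a SNP with allele frequency 0.5 has a per-locus match probability of 0.375, so 175 such loci give a combined value on the order of 10^-75. This idealized floor is consistent with the reported range of 4.77×10^-71 to 1.06×10^-64, since real allele frequencies deviate from 0.5 and differ among populations.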
