The Diversity of REcent and Ancient huMan (DREAM): A New Microarray for Genetic Anthropology and Genealogy, Forensics, and Personalized Medicine

Abstract The human population displays wide variety in demographic history, ancestry, the proportion of DNA derived from archaic hominins or ancient populations, adaptation, traits, copy number variation, drug response, and more. This variation is of broad interest to population geneticists, forensic investigators, and medical professionals. Historically, much of that knowledge was gained from population survey projects. Although many commercial arrays exist for genome-wide single-nucleotide polymorphism genotyping, their design specifications are limited and they do not allow a full exploration of biodiversity. We therefore aimed to design the Diversity of REcent and Ancient huMan (DREAM) array, an all-inclusive microarray that would allow both identification of known associations and exploration of standing questions in genetic anthropology, forensics, and personalized medicine. DREAM includes probes to interrogate ancestry informative markers obtained from over 450 human populations, over 200 ancient genomes, and 10 archaic hominins. DREAM can identify 94% and 61% of all known Y and mitochondrial haplogroups, respectively, and was vetted to avoid interrogation of clinically relevant markers. To demonstrate its capabilities, we compared its FST distributions with those of the 1000 Genomes Project and commercial arrays. Although all arrays yielded similarly shaped (inverse J) FST distributions, DREAM’s autosomal and X-chromosomal distributions had the highest mean FST, attesting to its ability to discern subpopulations. DREAM’s performance is further illustrated in biogeographical, identity-by-descent, and copy number variation analyses. In summary, with approximately 800,000 markers spanning nearly 2,000 genes, DREAM is a useful tool for genetic anthropology, forensics, and personalized medicine studies.
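
The FST comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes Wright's heterozygosity-based definition, FST = (HT - HS) / HT, for a biallelic marker with equally weighted subpopulations; the abstract does not state which estimator the authors used, and the function names and allele frequencies below are hypothetical, chosen only for illustration.

```python
from statistics import mean


def wright_fst(freqs):
    """FST for one biallelic SNP.

    freqs: allele frequency of the same allele in each subpopulation.
    Assumption: subpopulations are weighted equally (not the authors'
    stated procedure, just a common textbook form).
    """
    p_bar = mean(freqs)
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # monomorphic marker: no differentiation to measure
    h_s = mean(2 * p * (1 - p) for p in freqs)  # mean within-subpopulation
    return (h_t - h_s) / h_t


def mean_fst(snp_freqs):
    """Mean FST over a panel of SNPs (one frequency list per SNP)."""
    return mean(wright_fst(f) for f in snp_freqs)


# Hypothetical panels: each inner list holds one SNP's allele frequency
# in three subpopulations.
panel_a = [[0.10, 0.50, 0.90], [0.20, 0.25, 0.30]]
panel_b = [[0.40, 0.50, 0.60], [0.20, 0.25, 0.30]]
print(mean_fst(panel_a) > mean_fst(panel_b))
```

Under this definition, a marker panel enriched for SNPs whose frequencies differ strongly between subpopulations has a higher mean FST, which is the property the abstract cites as evidence that the array can discern subpopulations.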
