Proportioning whole-genome single-nucleotide-polymorphism diversity for the identification of geographic population structure and genetic ancestry.

The identification of geographic population structure and genetic ancestry on the basis of a minimal set of genetic markers is desirable for a wide range of applications in medical and forensic sciences. However, the absence of sharp discontinuities in the neutral genetic diversity among human populations implies that, in practice, a large number of neutral markers will be required to identify the genetic ancestry of a single individual. We showed that the number of markers required for detecting continental population structure can be reduced to only 10 single-nucleotide polymorphisms (SNPs) by applying a newly developed ascertainment algorithm to Affymetrix GeneChip Mapping 10K SNP array data that we obtained from samples of globally dispersed human individuals (the Y Chromosome Consortium panel). Furthermore, when applied to an independent, much larger, worldwide population data set (Centre d'Etude du Polymorphisme Humain-Human Genome Diversity Project Cell Line Panel), this set of SNPs was able to recover the genetic ancestry of individuals from all four continents represented in the original data set. Finally, we provide evidence that the unusual patterns of genetic variation we observed in the genomic regions surrounding the five most informative SNPs are consistent with local positive selection as the explanation for the striking SNP allele-frequency differences we found between continental groups of human populations.
