Integrating common and rare genetic variation in diverse human populations

Despite great progress in identifying genetic variants that influence human disease, most inherited risk remains unexplained. A more complete understanding requires genome-wide studies that fully examine less common alleles in populations with a wide range of ancestry. To inform the design and interpretation of such studies, we genotyped 1.6 million common single nucleotide polymorphisms (SNPs) in 1,184 reference individuals from 11 global populations, and sequenced ten 100-kilobase regions in 692 of these individuals. This integrated data set of common and rare alleles, called ‘HapMap 3’, includes both SNPs and copy number polymorphisms (CNPs). We characterized population-specific differences among low-frequency variants, measured the improvement in imputation accuracy afforded by the larger reference panel, especially in imputing SNPs with a minor allele frequency of ≤5%, and demonstrated the feasibility of imputing newly discovered CNPs and SNPs. This expanded public resource of genome variants in global populations supports deeper interrogation of genomic variation and its role in human disease, and serves as a step towards a high-resolution map of the landscape of human genetic variation.
