A second generation human haplotype map of over 3.1 million SNPs

We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25–35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r² of between 0.9 and 0.96 depending on population. We demonstrate that the current generation of commercial genome-wide genotyping products captures common Phase II SNPs with an average maximum r² of up to 0.8 in African and up to 0.95 in non-African populations, and that potential gains in power in association studies can be obtained through imputation. These data also reveal novel aspects of the structure of linkage disequilibrium. We show that 10–30% of pairs of individuals within a population share at least one region of extended genetic identity arising from recent ancestry and that up to 1% of all common variants are untaggable, primarily because they lie within recombination hotspots. We show that recombination rates vary systematically around genes and between genes of different function. Finally, we demonstrate increased differentiation at non-synonymous, compared to synonymous, SNPs, resulting from systematic differences in the strength or efficacy of natural selection between populations.
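The "average maximum r²" statistic quoted above can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions, not the consortium's pipeline: it assumes phased haplotypes coded as 0/1 alleles, and the function names `r_squared` and `mean_max_r2` and the four-haplotype data set are hypothetical. For each untyped SNP it takes the highest r² with any typed (tag) SNP, then averages those maxima.

```python
def r_squared(a, b):
    """Squared correlation (r^2) between two biallelic SNPs.

    a and b are equal-length lists of 0/1 alleles, one entry per haplotype.
    Returns 0.0 if either SNP is monomorphic (r^2 is undefined there).
    """
    n = len(a)
    pa = sum(a) / n                                  # allele frequency at SNP a
    pb = sum(b) / n                                  # allele frequency at SNP b
    pab = sum(x * y for x, y in zip(a, b)) / n       # frequency of the 1-1 haplotype
    d = pab - pa * pb                                # linkage disequilibrium coefficient D
    denom = pa * (1 - pa) * pb * (1 - pb)
    return 0.0 if denom == 0 else d * d / denom


def mean_max_r2(haplotypes, tag_idx):
    """Average, over untyped SNPs, of the best r^2 with any tag SNP.

    haplotypes: list of rows, each row a list of 0/1 alleles (one per SNP).
    tag_idx: indices of the SNPs treated as typed (hypothetical tag set).
    Assumes at least one tag SNP and at least one untyped SNP.
    """
    n_snps = len(haplotypes[0])
    columns = [[h[j] for h in haplotypes] for j in range(n_snps)]
    tags = set(tag_idx)
    best = [max(r_squared(columns[j], columns[t]) for t in tags)
            for j in range(n_snps) if j not in tags]
    return sum(best) / len(best)


# Toy data: SNP 1 is perfectly correlated with SNP 0; SNP 2 is independent of it.
toy_haplotypes = [
    [0, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 1, 1],
]
print(mean_max_r2(toy_haplotypes, tag_idx=[0]))  # 0.5
```

With SNP 0 as the only tag, SNP 1 is captured perfectly (r² = 1) and SNP 2 not at all (r² = 0), so the average maximum r² is 0.5.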

Zhaohui S. Qin | Gudmundur A. Thorisson | L. M. Sung | Pardis C Sabeti | P. Donnelly | C. Spencer | W. Guan | C. Freeman | P. Deloukas | A. Morris | J. Marchini | R. Saxena | S. Gabriel | M. Daly | G. Abecasis | A. Auton | L. Brooks | G. McVean | B. Birren | N. Carter | K. Gunderson | R. Wilson | L. Fulton | G. Weinstock | D. Altshuler | R. Gibbs | C. Clee | S. Hunt | T. Hudson | D. Cox | D. Bentley | S. Mccarroll | P. Sham | S. Schaffner | Z. Qin | Huy Nguyen | Jamie M Moore | J. Roy | B. Blumenstiel | M. Defelice | M. Faggart | C. Rotimi | S. Sherry | J. Mullikin | D. Willey | J. Rogers | P. Kwok | Daryl J. Thomas | K. Frazer | M. Stephens | Huanming Yang | Morris W Foster | M. Leppert | N. Patterson | L. Tsui | F. Collins | S. Leal | Xiaoli Tang | L. Cardon | J. Barrett | R. Gwilliam | P. Whittaker | Niall J Cardin | A. Price | D. Muzny | L. Nazareth | D. Wheeler | L. Ziaugra | R. Onofrio | D. Koboldt | Melissa Parkin | M. Guyer | J. Peterson | I. Pe’er | M. Ross | Supriya Gupta | D. Ballinger | D. Hinds | L. Stuve | J. Belmont | A. Boudreau | P. Hardenbol | S. Pasternak | T. Willis | F. Yu | Changqing Zeng | Yang Gao | Haoran Hu | Weitao Hu | Chaohua Li | Wei Lin | Siqi Liu | Hao Pan | Jian Wang | Wei Wang | Jun Yu | B. Zhang | Qingrun Zhang | Hongbin Zhao | Hui-Ping Zhao | Jun Zhou | Rachel Barry | Amy L. Camargo | Marie-Anne Goyette | E. Stahl | E. Winchester | Yan Shen | Zhijian Yao | Wei Huang | X. Chu | Yungang He | Li Jin | Yangfan Liu | Yayun Shen | Weiwei Sun | Hai-feng Wang | Yi Wang | Ying Wang | Xiao-yan Xiong | Liang Xu | M. Waye | S. Tsui | H. Xue | J. Wong | L. Galver | Jian-Bing Fan | S. Murray | A. Oliphant | M. Chee | A. Montpetit | F. Chagnon | V. Ferretti | M. Leboeuf | J. Olivier | M. Phillips | Stéphanie Roumy | C. Sallée | A. Verner | Dongmei Cai | Raymond D. Miller | L. Pawlikowska | P. Taillon-Miller | M. Xiao | W. Mak | You-Qiang Song | P. Tam | Yusuke Nakamura | T. Kawaguchi | T. Kitamoto | T. Morizono | A. Nagashima | Y. Ohnishi | A. Sekine | Toshihiro Tanaka | T. Tsunoda | C. Bird | Marcos Delgado | E. Dermitzakis | J. Morrison | Don Powell | B. Stranger | P. D. de Bakker | Y. Chretien | J. Maller | S. Purcell | D. Richter | P. Varilly | Lincoln D Stein | Lalitha Krishnan | Albert Vernon Smith | M. Tello-Ruiz | A. Chakravarti | Peter E. Chen | D. Cutler | C. Kashuk | Shin Lin | Yun Li | H. Munro | L. Bottolo | S. Eyheramendy | S. Myers | G. Clarke | David M. Evans | B. Weir | M. Feolo | A. Skol | Houcan Zhang | I. Matsuda | Y. Fukushima | D. Macer | Eiko Suda | C. Adebamowo | I. Ajayi | Toyin Aniagwu | P. Marshall | C. Nkwodimmah | C. Royal | M. Dixon | A. Peiffer | Renzong Qiu | A. Kent | Kazuto Kato | N. Niikawa | I. Adewole | B. Knoppers | E. Clayton | Jessica Watkin | E. Sodergren | I. Yakub | J. Burton | M. Griffiths | Matt Jones | K. McLay | R. Plumb | S. Sims | Zhu Chen | Hua Han | L. Kang | M. Godbout | J. Wallenburg | P. L'Archevêque | G. Bellemare | Koji Saeki | Hongguang Wang | Daochang An | H. Fu | Qing Li | Zhen Wang | Ren-hao Wang | A. Holden | J. Mcewen | V. Wang | Michael Shi | J. Spiegel | Lynn F. Zacharia | Karen L Kennedy | R. Jamieson | J. Stewart | Bo Zhang | Haifeng Wang | Heather M. Munro | Matthew C. Jones | A. Smith | Niall Cardin | Ruth Jamieson | Toyin I G Aniagwu | Huy L. Nguyen | D. Thomas | Vincent Ferretti | Jane L. Peterson | Takashi Morizono | Andrew Skol | Hongbo Fu
