Founder population-specific HapMap panel increases power in GWA studies through improved imputation accuracy and CNV tagging.

Combining genome-wide association (GWA) data across populations is a major challenge for large global meta-analyses. Genotype imputation using densely genotyped reference samples facilitates the combination of data across different genotyping platforms. HapMap data are typically used as a reference for imputing single nucleotide polymorphisms (SNPs) and for tagging copy number polymorphisms (CNPs). However, the advantage of population-specific reference panels for founder populations has not been evaluated. We examined the properties and impact of adding 81 individuals from a founder population to the HapMap3 reference data on imputation quality, CNP tagging, and power to detect association, both in simulations and in an independent cohort of 2138 individuals. The gain in SNP imputation accuracy was highest among low-frequency markers (minor allele frequency [MAF] < 5%), for which adding the population-specific samples to the reference set increased the median R² between imputed and genotyped SNPs from 0.90 to 0.94. Accuracy also increased in regions with high recombination rates. Similarly, the reference set with the population-specific extension facilitated the identification of better tag SNPs for a subset of CNPs: for 4% of CNPs, the R² between SNP genotypes and CNP intensity in the independent population cohort was at least twice as high as without the extension. We conclude that even a relatively small population-specific reference set yields considerable benefits in SNP imputation, CNP tagging accuracy, and power to detect associations, particularly in founder populations and population isolates.
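
The accuracy measure quoted above, R² between imputed and genotyped SNPs, can be illustrated with a minimal sketch. It assumes that R² means the squared Pearson correlation between imputed allele dosages (values from 0 to 2) and directly genotyped allele counts (0, 1, or 2) at a single SNP; the abstract does not spell out the exact estimator, and the function name and toy data are hypothetical.

```python
from math import nan
from statistics import mean


def squared_correlation(imputed_dosages, genotyped_counts):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: imputed_dosages holds expected allele counts in [0, 2] and
    genotyped_counts holds observed allele counts (0, 1, or 2) for the same
    individuals at one SNP.
    """
    if len(imputed_dosages) != len(genotyped_counts):
        raise ValueError("both sequences must cover the same individuals")
    if len(imputed_dosages) < 2:
        raise ValueError("at least two individuals are required")

    mean_x = mean(imputed_dosages)
    mean_y = mean(genotyped_counts)

    # Sums of squares and cross-products around the means.
    s_xy = sum((x - mean_x) * (y - mean_y)
               for x, y in zip(imputed_dosages, genotyped_counts))
    s_xx = sum((x - mean_x) ** 2 for x in imputed_dosages)
    s_yy = sum((y - mean_y) ** 2 for y in genotyped_counts)

    # A monomorphic marker has no variance, so R^2 is undefined.
    if s_xx == 0 or s_yy == 0:
        return nan
    return (s_xy * s_xy) / (s_xx * s_yy)


# Hypothetical toy data for one SNP in six individuals.
dosages = [0.1, 0.9, 1.8, 0.0, 1.1, 2.0]
genotypes = [0, 1, 2, 0, 1, 2]
print(f"R^2 = {squared_correlation(dosages, genotypes):.3f}")
```

In an evaluation of the kind described, such a per-SNP value would be computed for markers masked before imputation and then summarized, for example as a median within MAF bins.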
