Evaluation of the Metabochip Genotyping Array in African Americans and Implications for Fine Mapping of GWAS-Identified Loci: The PAGE Study

The Metabochip is a custom genotyping array designed for replication and fine mapping of metabolic, cardiovascular, and anthropometric trait loci; its content includes low-frequency variation identified by the 1000 Genomes Project. The array carries 196,725 SNPs concentrated in 257 genomic regions. We evaluated the Metabochip in 5,863 African Americans; 89% of all SNPs passed rigorous quality control with a call rate of 99.9%. Two examples illustrate the value of fine mapping with the Metabochip in African-ancestry populations. At CELSR2/PSRC1/SORT1, the SNP most strongly associated with LDL-C was rs12740374 (p = 3.5 × 10⁻¹¹). In European-ancestry samples this SNP is so highly correlated with multiple neighboring SNPs that its signal cannot be distinguished from theirs. Its distinct signal in African Americans supports functional studies elsewhere that suggest a causal role in LDL-C. At CETP, we found rs17231520, with a risk allele frequency of 0.07 in African Americans, to be associated with HDL-C (p = 7.2 × 10⁻³⁶). This variant is very rare in Europeans and is not tagged on common GWAS arrays, but it was previously identified as associated with HDL-C in African Americans in a single-gene study. These two results, one narrowing the risk interval and the other revealing an associated variant not found in Europeans, demonstrate the advantages of high-density genotyping of common and rare variation for fine mapping of trait loci in African American samples.
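The fine-mapping argument at CELSR2/PSRC1/SORT1 rests on correlation between SNPs: when two SNPs are nearly perfectly correlated in a sample, their association signals cannot be separated, and weaker correlation lets one SNP stand out. The sketch below illustrates this with the squared Pearson correlation (r²) between genotype dosage vectors. The genotype vectors are hypothetical toy data, not values from the study, and the function name `r_squared` is an illustrative choice rather than anything used by the authors.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov * cov / (var_x * var_y)


# Hypothetical toy genotypes for two SNPs in two samples (NOT study data).
# Sample A: the two SNPs always co-occur, so they are statistically indistinguishable.
snp1_sample_a = [0, 1, 2, 1, 0, 2, 1, 0]
snp2_sample_a = [0, 1, 2, 1, 0, 2, 1, 0]

# Sample B: the same two SNPs are only loosely correlated, so their signals can differ.
snp1_sample_b = [0, 1, 2, 1, 0, 2, 1, 0]
snp2_sample_b = [0, 0, 2, 1, 1, 1, 1, 0]

r2_a = r_squared(snp1_sample_a, snp2_sample_a)
r2_b = r_squared(snp1_sample_b, snp2_sample_b)

print(f"sample A r^2 = {r2_a:.2f}")
print(f"sample B r^2 = {r2_b:.2f}")
```

In sample A an association test would give both SNPs the same result, whereas in sample B the lower r² leaves room for one SNP to show a clearly stronger association than the other.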
