Enhanced Statistical Tests for GWAS in Admixed Populations: Assessment using African Americans from CARe and a Breast Cancer Consortium

While genome-wide association studies (GWAS) have primarily examined populations of European ancestry, more recent studies often involve additional populations, including admixed populations such as African Americans and Latinos. In admixed populations, linkage disequilibrium (LD) exists both at a fine scale in ancestral populations and at a coarse scale (admixture-LD) due to chromosomal segments of distinct ancestry. Disease association statistics in admixed populations have previously considered SNP association (LD mapping) or admixture association (mapping by admixture-LD), but not both. Here, we introduce a new statistical framework for combining SNP and admixture association in case-control studies, as well as methods for local ancestry-aware imputation. We illustrate the gain in statistical power achieved by these methods by analyzing data from 6,209 unrelated African Americans from the CARe project genotyped on the Affymetrix 6.0 chip, in conjunction with both simulated and real phenotypes, and by analyzing the FGFR2 locus using breast cancer GWAS data from 5,761 African-American women. At typed SNPs, our method yields an 8% increase in statistical power for finding disease risk loci compared with standard methods in case-control studies. At imputed SNPs, we observe an 11% increase in statistical power for mapping disease loci when our local ancestry-aware imputation framework and the new scoring statistic are used jointly. Finally, we show that our method increases statistical power in regions harboring the causal SNP when the causal SNP is untyped and cannot be imputed. Our methods and our publicly available software are broadly applicable to GWAS in admixed populations.
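
The abstract does not spell out the combined statistic, so the following is only a generic sketch of how two association signals at one locus might be pooled; it is not claimed to be the scoring statistic introduced here. It assumes (hypothetically) that the SNP test and the admixture test each produce a 1-degree-of-freedom chi-square statistic and that the two are independent under the null, in which case their sum follows a chi-square distribution with 2 degrees of freedom. The function names and input values are illustrative.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_2df(x):
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


def combined_test(snp_chi2, admixture_chi2):
    """Sum two 1-df statistics and return (total, 2-df p-value).

    ASSUMPTION: the two inputs are independent under the null hypothesis.
    This is a generic combination rule, not the paper's own statistic.
    """
    total = snp_chi2 + admixture_chi2
    return total, chi2_sf_2df(total)


if __name__ == "__main__":
    # Hypothetical statistics for a single locus.
    snp_stat, admix_stat = 4.0, 6.0
    total, p_combined = combined_test(snp_stat, admix_stat)
    print("SNP-only p-value:      ", chi2_sf_1df(snp_stat))
    print("Admixture-only p-value:", chi2_sf_1df(admix_stat))
    print("Combined 2-df p-value: ", p_combined)
```

The sum pays for an extra degree of freedom, so it is most useful when both signals contribute evidence; when only one of them carries signal, the single 1-df test can give the smaller p-value.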
