Comparison of Six Statistics of Genetic Association Regarding Their Ability to Discriminate between Causal Variants and Genetically Linked Markers

Objectives: Genome-wide association (GWA) studies still rely on the common-disease common-variant hypothesis because this assumption is associated with increased statistical power. In GWA studies, polymorphisms are genotyped and their association with disease is investigated. Most of the identified associations are indirect and reflect the shared inheritance of genotyped markers and genetically linked causal variants. We compared six statistics of genetic association regarding their ability to discriminate between markers and causal susceptibility variants, including a probability value (Pval) and a Bayes factor (BF) based on logistic regression, and the attributable familial relative risk (FRR). Methods: We carried out a simulation-based sensitivity analysis to explore several conceivable scenarios. Theoretical results were illustrated with established causal associations with age-related macular degeneration and with HapMap-based imputed data for a case-control study of breast cancer. Results: Our data indicate that representing genetic association by FRRs and BFs generally facilitates the identification of causal variants. The FRR showed the best discriminative power under most of the investigated scenarios, but no single statistic outperformed the others in all situations. For example, rare variants of moderate to low penetrance (allele frequency: 1%, dominant odds ratio: ≤2.0) seem to be best discriminated by BFs. Conclusions: The present results may help to make full use of the data generated in association studies that take advantage of next-generation sequencing and/or multiple imputation based on the 1000 Genomes Project.
