To stratify or not to stratify: power considerations for population‐based genome‐wide association studies of quantitative traits

Meta‐analyses of genome‐wide association studies require numerous study partners to conduct pre‐defined analyses and therefore call for simple but efficient analysis plans. Potential differences between strata (e.g. men and women) are usually ignored, but the question often arises whether stratified analyses help to unravel the genetics of a phenotype or merely increase the analysis burden. To decide whether to stratify or not to stratify, we compare general analytical power computations for the overall analysis with those for stratified analyses, considering quantitative traits and two strata. We also relate the stratification problem to interaction modeling and illustrate the theoretical considerations with examples from obesity and renal function genetics. We demonstrate that the overall analysis has better power than stratified analyses as long as the signals are pronounced in both strata with consistent effect direction. Stratified analyses are advantageous for signals with zero (or very small) effect in one stratum and for signals with opposite effect direction in the two strata. Applying the joint test for a main SNP effect and SNP‐stratum interaction beats both overall and stratified analyses regarding power, but involves more complex models. In summary, we recommend employing stratified analyses or the joint test to better understand the potential of strata‐specific signals and of signals with opposite effect direction. Only after systematic genome‐wide searches for opposite effect direction loci have been conducted will we know whether such signals exist and to what extent stratified analyses can detect loci that would otherwise be missed. Genet. Epidemiol. 35:867‐879, 2011. © 2011 Wiley Periodicals, Inc.
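
The comparison described above can be illustrated with a minimal sketch. It is not the paper's own derivation but a simplified normal approximation under assumptions introduced here: an additive SNP effect, Hardy-Weinberg equilibrium, the same residual standard deviation in both strata, an inverse-variance weighted combination of the two stratum estimates standing in for the overall analysis, a Bonferroni correction for testing two strata separately, and a 2-df chi-square test of both stratum effects standing in for the joint test. All function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

_norm = NormalDist()

def power_1df(z_expected, alpha):
    """Two-sided power of a z-test whose statistic has mean z_expected."""
    crit = _norm.inv_cdf(1 - alpha / 2)
    return _norm.cdf(z_expected - crit) + _norm.cdf(-z_expected - crit)

def power_2df(ncp, alpha, terms=400):
    """Power of a 2-df chi-square test with noncentrality ncp.

    Uses the Poisson-mixture form of the noncentral chi-square and the
    closed-form survival function of a central chi-square with even df.
    """
    half_x = -math.log(alpha)    # chi2(2 df) critical value is -2*ln(alpha)
    weight = math.exp(-ncp / 2)  # Poisson(ncp/2) probability of j = 0
    term = 1.0                   # (x/2)^i / i! for i = 0
    partial = 1.0                # running sum of the terms above
    total = 0.0
    for j in range(terms):
        # alpha * partial is the survival function of chi2 with 2 + 2j df
        total += weight * alpha * partial
        weight *= (ncp / 2) / (j + 1)
        term *= half_x / (j + 1)
        partial += term
    return total

def compare_power(b1, b2, n1, n2, maf=0.3, sd=1.0, alpha=5e-8):
    """Approximate power of overall, stratified and joint analyses."""
    var_g = 2 * maf * (1 - maf)            # genotype variance under HWE
    se1 = sd / math.sqrt(n1 * var_g)       # approximate standard errors
    se2 = sd / math.sqrt(n2 * var_g)
    z1, z2 = b1 / se1, b2 / se2            # expected stratum z-statistics
    w1, w2 = 1 / se1 ** 2, 1 / se2 ** 2    # inverse-variance weights
    z_overall = (w1 * b1 + w2 * b2) / math.sqrt(w1 + w2)
    p1 = power_1df(z1, alpha / 2)          # Bonferroni for two strata
    p2 = power_1df(z2, alpha / 2)
    return {
        "overall": power_1df(z_overall, alpha),
        "stratified": 1 - (1 - p1) * (1 - p2),  # detected in either stratum
        "joint": power_2df(z1 ** 2 + z2 ** 2, alpha),
    }

if __name__ == "__main__":
    scenarios = {
        "same effect": (0.045, 0.045),
        "one stratum only": (0.065, 0.0),
        "opposite direction": (0.045, -0.045),
    }
    for label, (b1, b2) in scenarios.items():
        print(label, compare_power(b1, b2, n1=20000, n2=20000))
```

Under these assumptions, the three hypothetical scenarios mirror the cases in the abstract: a consistent effect in both strata, an effect confined to one stratum, and effects of opposite direction. Exact numbers depend on the chosen sample sizes, allele frequency and significance level.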
