Sample size determination for studies of gene-environment interaction.

BACKGROUND The search for interaction effects is common in epidemiological studies, but the power of such studies is a major concern. This is a practical issue, as many future studies will investigate potential gene-gene and gene-environment interactions and therefore need to be planned on the basis of appropriate sample size calculations.

METHODS The underlying model considered in this paper is a simple linear regression relating a continuous outcome to a continuously distributed exposure variable.

RESULTS The slope of the regression line is taken to be dependent on genotype, and the ratio of the slopes for each genotype is considered as the interaction parameter. Sample size is affected by the allele frequency and by whether the genetic model is dominant or recessive. It is also critically dependent upon the size of the association between exposure and outcome, and on the strength of the interaction term. The link between these determinants is displayed graphically to allow sample size and power to be estimated. An example, the analysis of the association between physical activity and glucose intolerance, demonstrates how information from previous studies can be used to determine the sample size required to examine gene-environment interactions.

CONCLUSIONS The formulae for computing the sample size required to study the interaction between a continuous environmental exposure and a genetic factor on a continuous outcome variable should have practical utility in assisting the design of studies of appropriate power.
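The dependence of sample size on allele frequency, genetic model, exposure-outcome slope and slope ratio can be illustrated with a rough sketch. This is not the paper's formulae, which are not reproduced in the abstract. It is a large-sample normal approximation for testing a difference in regression slopes between two genotype groups, and it assumes Hardy-Weinberg proportions, a common residual standard deviation, and an exposure distribution that does not depend on genotype. All function and parameter names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def carrier_fraction(allele_freq, model):
    """Fraction of subjects in the at-risk genotype group.

    Assumes Hardy-Weinberg proportions (an assumption of this sketch).
    """
    if model == "dominant":
        # One or two copies of the variant allele.
        return 1.0 - (1.0 - allele_freq) ** 2
    if model == "recessive":
        # Two copies of the variant allele.
        return allele_freq ** 2
    raise ValueError("model must be 'dominant' or 'recessive'")


def interaction_sample_size(allele_freq, model, base_slope, slope_ratio,
                            sd_exposure, sd_residual,
                            alpha=0.05, power=0.80):
    """Approximate total N to detect a slope ratio between genotype groups.

    The slope in the reference group is base_slope and the slope in the
    at-risk group is base_slope * slope_ratio. The test is a two-sided
    z-test on the difference in slopes (normal approximation).
    """
    p = carrier_fraction(allele_freq, model)
    slope_diff = base_slope * (slope_ratio - 1.0)
    if slope_diff == 0.0:
        raise ValueError("no interaction to detect: slope difference is zero")

    std_normal = NormalDist()
    z_total = std_normal.inv_cdf(1.0 - alpha / 2.0) + std_normal.inv_cdf(power)

    # Var(slope difference) ~ sd_residual^2 / (N * sd_exposure^2 * p * (1 - p))
    n = (z_total * sd_residual) ** 2 / (
        sd_exposure ** 2 * p * (1.0 - p) * slope_diff ** 2
    )
    return ceil(n)


if __name__ == "__main__":
    for model in ("dominant", "recessive"):
        n = interaction_sample_size(
            allele_freq=0.2, model=model, base_slope=0.5, slope_ratio=2.0,
            sd_exposure=1.0, sd_residual=1.0,
        )
        print(model, n)
```

Under these assumptions the required N scales with 1 / (p(1 - p)), where p is the at-risk genotype fraction, so a rare recessive genotype needs far more subjects than a dominant model at the same allele frequency. It also scales with the inverse square of the slope difference, which is why both the main exposure effect and the interaction strength matter.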
