Discovery properties of genome-wide association signals from cumulatively combined data sets.

Genetic effects for common variants affecting complex disease risk are subtle. Single genome-wide association (GWA) studies are typically underpowered to detect these effects, and several GWA data sets must be combined to enhance discovery. The authors investigated the properties of the discovery process in simulated cumulative meta-analyses of GWA study-derived signals, allowing for potential genetic model misspecification and between-study heterogeneity. Variants with null effects on average, but with between-data set heterogeneity, could yield false-positive associations with seemingly homogeneous effects. Random-effects calculations had higher than appropriate false-positive rates when there were few data sets. The log-additive model had the lowest false-positive rate. Under heterogeneity, random-effects meta-analyses of 2-10 data sets averaging 1,000 cases and 1,000 controls each did not increase power, or were even less powerful than a single study (a "power desert"). Upward bias in effect estimates and underestimation of between-study heterogeneity were common. Fixed-effects calculations avoided power deserts and maximized discovery of association signals, at the expense of much higher false-positive rates. Therefore, random- and fixed-effects models are preferable for different purposes (fixed effects for initial screenings, random effects for generalizability applications). These results may have broader implications for the design and interpretation of large-scale multiteam collaborative studies discovering common gene variants.
