On meta‐ and mega‐analyses for gene–environment interactions

Gene‐by‐environment (G × E) interactions are important for explaining missing heritability and understanding the causes of complex diseases, but a single, moderately sized study often has limited statistical power to detect them. With the growing need to integrate data and report results from multiple collaborating studies or sites, the debate over whether to use mega‐ or meta‐analysis continues. In principle, data from different sites can be integrated at the individual level into a “mega” data set, to which a joint “mega‐analysis” is fit. Alternatively, each site can be analyzed separately, and the site‐level results can be combined through a “meta‐analysis” procedure without integrating individual‐level data across sites. Although mega‐analysis has been advocated in several recent initiatives, meta‐analysis has the advantages of simplicity and feasibility, and has recently led to several important findings in identifying main genetic effects. In this paper, we conducted empirical and simulation studies, using data from a G × E study of lung cancer, to compare mega‐ and meta‐analysis in four commonly used G × E analyses when the number of studies is small and the sample sizes of the individual studies are relatively large. We compared the two data integration approaches separately under fixed effect models and random effects models. Our investigations provide insight into how mega‐ and meta‐analyses differ in practice when a small number of studies are combined to identify G × E interactions.
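As a rough illustration of the meta‐analysis side of this comparison, the sketch below combines per‐site G × E interaction estimates under a fixed effect model (inverse‐variance weighting) and under a random effects model. It uses the DerSimonian‐Laird moment estimator for the between‐site variance, which is one common choice; the abstract does not say which estimator the authors used. The function names and the input numbers are hypothetical and are not taken from the lung cancer study.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


def random_effects_meta(estimates, std_errors):
    """Random effects pooling with the DerSimonian-Laird estimate of tau^2.

    Assumption: DerSimonian-Laird is used here only as a familiar example;
    other between-site variance estimators exist.
    """
    k = len(estimates)
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled_fixed, _ = fixed_effect_meta(estimates, std_errors)

    # Cochran's Q: weighted squared deviations from the fixed effect estimate.
    q = sum(w * (b - pooled_fixed) ** 2 for w, b in zip(weights, estimates))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)

    # Moment estimate of the between-site variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Re-weight each site by its within-site plus between-site variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    total = sum(re_weights)
    pooled = sum(w * b for w, b in zip(re_weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total), tau2


# Hypothetical per-site interaction log odds ratios and standard errors.
site_estimates = [0.10, 0.30]
site_std_errors = [0.10, 0.10]

fe_estimate, fe_se = fixed_effect_meta(site_estimates, site_std_errors)
re_estimate, re_se, tau2 = random_effects_meta(site_estimates, site_std_errors)
```

With only a few sites, as in the scenario studied here, the between‐site variance is estimated from very little information, which is one reason the fixed effect and random effects comparisons are worth examining separately. A mega‐analysis would instead fit a single regression, including the G × E interaction term, to the pooled individual‐level records.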
