Two-stage testing procedures with independent filtering for genome-wide gene-environment interaction.

Several two-stage multiple testing procedures have been proposed to detect gene-environment interaction in genome-wide association studies. In this article, we elucidate general conditions that are required for validity and power of these procedures, and we propose extensions of two-stage procedures using the case-only estimator of gene-treatment interaction in randomized clinical trials. We develop a unified estimating equation approach to proving asymptotic independence between a filtering statistic and an interaction test statistic in a range of situations, including marginal association and interaction in a generalized linear model with a canonical link. We assess the performance of various two-stage procedures in simulations and in genetic studies from Women's Health Initiative clinical trials.
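The general shape of a two-stage procedure with filtering can be sketched as follows. This is a minimal illustration, not the authors' specific method. It assumes that each marker already has a stage-1 filtering p-value (for example, from a marginal association test) and a stage-2 interaction p-value, and that the two statistics are asymptotically independent under the null, which is the condition the abstract identifies as necessary for validity. The function name `two_stage_bonferroni` and the thresholds `alpha1` and `alpha` are hypothetical choices made for this example.

```python
from typing import List, Sequence


def two_stage_bonferroni(
    filter_pvals: Sequence[float],
    interaction_pvals: Sequence[float],
    alpha1: float = 0.05,
    alpha: float = 0.05,
) -> List[int]:
    """Return indices of markers rejected by a simple two-stage procedure.

    Stage 1 keeps markers whose filtering p-value is at most alpha1.
    Stage 2 applies a Bonferroni correction to the interaction p-values,
    dividing alpha by the number of markers that passed stage 1 only.
    Assumes the filter and interaction statistics are independent
    under the null hypothesis (hypothetical setup for illustration).
    """
    if len(filter_pvals) != len(interaction_pvals):
        raise ValueError("p-value sequences must have the same length")

    # Stage 1: screening on the filtering statistic.
    passed = [i for i, p in enumerate(filter_pvals) if p <= alpha1]
    if not passed:
        return []

    # Stage 2: Bonferroni threshold over the reduced set of markers.
    threshold = alpha / len(passed)
    return [i for i in passed if interaction_pvals[i] <= threshold]


# Toy p-values for five markers (made up for illustration).
filter_p = [0.001, 0.20, 0.03, 0.60, 0.04]
interaction_p = [0.012, 0.0001, 0.030, 0.001, 0.015]
rejected = two_stage_bonferroni(filter_p, interaction_p)
```

In this toy input, three markers pass the filter, so the stage-2 threshold is 0.05/3 rather than 0.05/5. Markers 0 and 4 are rejected even though neither would survive a one-stage Bonferroni correction over all five markers. Marker 1 has the smallest interaction p-value but is lost at stage 1, which shows the power trade-off that the choice of filter controls.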

James Y. Dai | Charles Kooperberg | Michael LeBlanc | Ross L. Prentice
