A Nonparametric Test to Detect Quantitative Trait Loci Where the Phenotypic Distribution Differs by Genotypes

Searching for genetic variants involved in gene‐gene and gene‐environment interactions in large‐scale data raises multiple methodological issues. Many existing methods have focused on the problem of dimensionality, trying to explore the largest number of combinations between risk factors while considering simple interaction models. Despite evidence demonstrating the efficacy of these methods in simulated data, their application in real data has been unsuccessful so far. The classical test of a linear marginal genetic effect has been widely used for agnostic genome‐wide association studies, with the underlying idea that most variants involved in interactions might display a marginal effect on the phenotypic mean. Although this approach may allow for the identification of genetic variants involved in interactions in many scenarios, the linear marginal effects of some causal alleles on the phenotypic mean might not always be detectable at the genome‐wide significance level. In this study, we introduce a general association test for quantitative trait loci that compares the distributions of phenotypic values by genotypic class, as opposed to most standard tests, which compare phenotypic means by genotypic class. Using simulations, we show that in the presence of interactions, this approach can be more powerful than the standard test of the linear marginal effect, with the gain in power increasing as the interaction effect increases and as the frequencies of the interacting exposures decrease. We demonstrate the potential utility of our method on real data by analyzing genome‐wide mammographic density data from the Nurses’ Health Study.
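The contrast between a test of the phenotypic mean and a test of the whole phenotypic distribution can be illustrated with a small simulation. The sketch below is not the test statistic proposed in this study: as a stand-in, it compares genotypic classes with pairwise two-sample Kolmogorov-Smirnov tests and a Bonferroni correction, and it uses ordinary linear regression for the marginal test. The function names, the data-generating model (an unobserved binary exposure that interacts with the genotype), and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np
from scipy import stats


def simulate_trait(n=4000, maf=0.3, exposure_freq=0.1, beta_int=1.0, seed=0):
    """Simulate a quantitative trait with a pure genotype-by-exposure effect.

    Hypothetical model: the exposure is unobserved, so the analyst only
    sees the genotype g and the phenotype y.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, maf, size=n)            # additive coding 0/1/2
    e = rng.binomial(1, exposure_freq, size=n)  # unobserved binary exposure
    y = beta_int * g * e + rng.normal(size=n)   # effect only when exposed
    return g, y


def linear_marginal_pvalue(g, y):
    """Standard test: slope of y regressed on the genotype count."""
    return stats.linregress(g, y).pvalue


def distribution_pvalue(g, y, min_size=2):
    """Stand-in distribution test (an assumption, not the paper's statistic).

    Runs a two-sample Kolmogorov-Smirnov test for every pair of genotypic
    classes and returns the Bonferroni-adjusted minimum p-value.
    """
    groups = [y[g == k] for k in (0, 1, 2)]
    pvals = []
    for a, b in [(0, 1), (0, 2), (1, 2)]:
        # Skip pairs where a genotypic class is (nearly) empty.
        if len(groups[a]) >= min_size and len(groups[b]) >= min_size:
            pvals.append(stats.ks_2samp(groups[a], groups[b]).pvalue)
    if not pvals:
        return 1.0
    return min(1.0, min(pvals) * len(pvals))


if __name__ == "__main__":
    g, y = simulate_trait()
    print("linear marginal test p-value:", linear_marginal_pvalue(g, y))
    print("distribution test p-value:   ", distribution_pvalue(g, y))
```

A single simulated data set says nothing about power. To compare the two approaches in the spirit of the simulations described above, one would repeat the simulation many times over a grid of `beta_int` and `exposure_freq` values and record how often each p-value falls below a chosen threshold.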
