Genotype‐based association mapping of complex diseases: gene‐environment interactions with multiple genetic markers and measurement error in environmental exposures

With the advent of dense single nucleotide polymorphism genotyping, population‐based association studies have become a major tool for identifying human disease genes and for fine mapping of complex traits. We develop a genotype‐based approach for association analysis of case‐control studies of gene‐environment interactions when environmental factors are measured with error and genotype data are available on multiple genetic markers. To use the observed genotype data directly, we propose two genotype‐based models: a genotype effect model and an additive effect model. Our approach offers several advantages. First, the proposed risk functions incorporate the observed genotype data directly while modeling the linkage disequilibrium information in the regression coefficients, thus eliminating the need to infer haplotype phase. Second, compared with the haplotype‐based approach, an estimating procedure based on the proposed methods can be much simpler and significantly faster. Third, because no haplotype phase is estimated, the analysis is not exposed to errors from phase estimation. Finally, by fitting the proposed models, it is possible to analyze the risk alleles/variants of complex diseases, including their dominant or additive effects. To model measurement error, we adopt the pseudo‐likelihood method of Lobach et al. [2008]. The performance of the proposed method is examined using simulation experiments. An application of our method is illustrated using a population‐based case‐control study of the association between calcium intake and the risk of colorectal adenoma development. Genet. Epidemiol. 34:792‐802, 2010. © 2010 Wiley‐Liss, Inc.
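As a rough illustration of what an additive, genotype-based risk function with a gene‐environment interaction can look like, the sketch below codes each unphased genotype as a risk-allele count and plugs it into a logistic model. It is a minimal sketch under stated assumptions, not the models defined in the paper: the function names, the two-character genotype strings, the choice of `"A"` as the risk allele, and the coefficient layout are all hypothetical, and the measurement-error correction (the pseudo‐likelihood step) is not modeled here.

```python
import math


def additive_code(genotype, risk_allele="A"):
    """Count copies of the (assumed) risk allele in an unphased genotype, e.g. 'Aa' -> 1."""
    return sum(1 for allele in genotype if allele == risk_allele)


def dominance_code(genotype, risk_allele="A"):
    """Return 1 for a heterozygote and 0 otherwise (one common dominance coding)."""
    return 1 if additive_code(genotype, risk_allele) == 1 else 0


def logistic_risk(genotypes, exposure, beta0, beta_g, beta_e, beta_ge):
    """Illustrative disease probability for one subject.

    genotypes : list of genotype strings, one per marker (no phase needed)
    exposure  : environmental covariate, treated here as error-free (assumption)
    beta_g    : per-marker additive coefficients
    beta_ge   : per-marker gene-by-environment interaction coefficients
    """
    eta = beta0 + beta_e * exposure
    for g, b_g, b_ge in zip(genotypes, beta_g, beta_ge):
        x = additive_code(g)
        eta += b_g * x + b_ge * x * exposure
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical subject with two markers and a standardized exposure of 1.0.
risk = logistic_risk(["Aa", "AA"], 1.0, -2.0, [0.3, 0.1], 0.2, [0.05, 0.0])
```

Because the genotype codes enter the linear predictor directly, no haplotype phase has to be inferred before evaluating the risk; a genotype effect variant could add `dominance_code` terms alongside the additive ones.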

[1]  R. Carroll, et al. Increased risk of early-stage breast cancer related to consumption of sweet foods among women less than age 45 in the United States, 2002, Cancer Causes & Control.

[2]  M. Olivier. A haplotype map of the human genome, 2003, Nature.

[3]  R. Carroll, et al. Haplotype‐Based Regression Analysis and Inference of Case–Control Studies with Unphased Genotypes and Measurement Errors in Environmental Exposures, 2008, Biometrics.

[5]  M. Daly, et al. A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms, 2001, Nature.

[6]  Alexander Kukush, et al. Measurement Error Models, 2011, International Encyclopedia of Statistical Science.

[7]  Peter Donnelly, et al. A comparison of Bayesian methods for haplotype reconstruction from population genotype data, 2003, American Journal of Human Genetics.

[8]  R. Hayes, et al. Association of genetic variants in the calcium-sensing receptor with risk of colorectal adenoma, 2004, Cancer Epidemiology, Biomarkers & Prevention.

[9]  Zhaohui S. Qin, et al. A second generation human haplotype map of over 3.1 million SNPs, 2007, Nature.

[10]  Bhramar Mukherjee, et al. Exploiting Gene‐Environment Independence for Analysis of Case–Control Studies: An Empirical Bayes‐Type Shrinkage Estimator to Trade‐Off between Bias and Efficiency, 2008, Biometrics.

[11]  Zhaohui S. Qin, et al. A comparison of phasing algorithms for trios and unrelated individuals, 2006, American Journal of Human Genetics.

[12]  Geoffrey B. Nilsen, et al. Whole-Genome Patterns of Common DNA Variation in Three Human Populations, 2005, Science.

[13]  Zhaohui S. Qin, et al. Partition-ligation-expectation-maximization algorithm for haplotype inference with single-nucleotide polymorphisms, 2002, American Journal of Human Genetics.

[14]  A. Chakravarti, et al. Haplotype inference in random population samples, 2002, American Journal of Human Genetics.

[15]  A. Schatzkin, et al. Observational Epidemiologic Studies of Nutrition and Cancer: The Next Generation (with Better Observation), 2009, Cancer Epidemiology, Biomarkers & Prevention.

[16]  High resolution mapping of quantitative trait loci by linkage disequilibrium analysis, 2002, European Journal of Human Genetics.

[17]  Toshihiro Tanaka. The International HapMap Project, 2003, Nature.

[18]  Ruzong Fan, et al. High-Resolution Association Mapping of Quantitative Trait Loci: A Population-Based Approach, 2006, Genetics.

[19]  D. Ruppert, et al. Measurement Error in Nonlinear Models, 1995.

[20]  Raymond J Carroll, et al. Analysis of case‐control studies of genetic and environmental factors with missing genetic information and haplotype‐phase ambiguity, 2005, Genetic Epidemiology.

[21]  Nilanjan Chatterjee, et al. Semiparametric maximum likelihood estimation exploiting gene-environment independence in case-control studies, 2005.

[22]  P. Donnelly, et al. A new statistical method for haplotype reconstruction from population data, 2001, American Journal of Human Genetics.

[23]  D. Midthune, et al. Using intake biomarkers to evaluate the extent of dietary misreporting in a large sample of adults: the OPEN study, 2003, American Journal of Epidemiology.