Hierarchical modeling in association studies of multiple phenotypes

The genetic study of disease-associated phenotypes has become common because such phenotypes are often easier to measure and in many cases are under greater genetic control than the complex disease itself. Some disease-associated phenotypes are rare, however, which makes their effects difficult to evaluate because informative sample sizes are small. In addition, analyzing numerous phenotypes introduces the issue of multiple comparisons. To address these issues, we have developed a hierarchical model (HM) for multiple phenotypes that provides more accurate effect estimates with a lower false-positive rate. We evaluated the validity and power of HM in association studies of multiple phenotypes using randomly selected cases and controls from the simulated data set in the Genetic Analysis Workshop 14. In particular, we first analyzed the association between each of the 12 subclinical phenotypes and single-nucleotide polymorphisms (SNPs) within the known causal loci using a conventional logistic regression model (LRM). Then we added a second-stage model by regressing all of the logistic coefficients of the phenotypes obtained from LRM on a Z matrix that incorporates the clinical correlation of the phenotypes. Specifically, the 12 phenotypes were grouped into 3 clusters: 1) communally shared emotions; 2) behavior related; and 3) anxiety related. A semi-Bayes HM effect estimate for each phenotype was calculated and compared with those from LRM. We observed that using HM to evaluate the association between SNPs and multiple related phenotypes slightly increased power for detecting the true associations and also led to fewer false-positive results.
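
The second-stage step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the special case in which Z is a pure cluster-indicator matrix (each phenotype belongs to exactly one cluster), the first-stage estimates are treated as independent, and the second-stage variance `tau2` is prespecified, as is usual in semi-Bayes analyses. The function name, the coefficient values, the standard errors, the value of `tau2`, and the equal cluster sizes are all hypothetical; the abstract gives none of them.

```python
def semi_bayes_shrink(beta_hat, se, cluster, tau2):
    """Shrink first-stage log-odds estimates toward their cluster means.

    beta_hat : first-stage logistic coefficients, one per phenotype
    se       : their standard errors
    cluster  : cluster label for each phenotype (a row of an indicator Z)
    tau2     : prespecified second-stage (residual) variance
    """
    # Second-stage weights: inverse of (sampling variance + prior variance).
    weights = [1.0 / (s ** 2 + tau2) for s in se]

    # With an indicator Z, weighted least squares on Z reduces to a
    # weighted mean of the first-stage coefficients within each cluster.
    num, den = {}, {}
    for b, w, c in zip(beta_hat, weights, cluster):
        num[c] = num.get(c, 0.0) + w * b
        den[c] = den.get(c, 0.0) + w
    prior_mean = {c: num[c] / den[c] for c in num}

    # Each estimate is pulled toward its cluster mean; imprecise estimates
    # (large se relative to tau2) are pulled the most.
    shrunk = []
    for b, s, c in zip(beta_hat, se, cluster):
        shrink = s ** 2 / (s ** 2 + tau2)
        shrunk.append(shrink * prior_mean[c] + (1.0 - shrink) * b)
    return shrunk


# Hypothetical inputs: 12 phenotypes in 3 clusters of 4 (sizes assumed).
clusters = [0] * 4 + [1] * 4 + [2] * 4
betas = [0.9, 0.1, 0.4, 0.6, -0.2, 0.3, 0.0, 0.1, 1.2, 0.8, 0.2, 0.6]
ses = [0.5, 0.3, 0.4, 0.6, 0.3, 0.3, 0.5, 0.4, 0.7, 0.4, 0.3, 0.5]
hm_estimates = semi_bayes_shrink(betas, ses, clusters, tau2=0.25)
```

Setting `tau2` close to zero forces every phenotype in a cluster toward a common effect, whereas a very large `tau2` leaves the LRM estimates essentially unchanged; a general Z matrix or correlated first-stage estimates would need the full matrix form of the estimator.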
