Bivariate association analyses for the mixture of continuous and binary traits with the use of extended generalized estimating equations

Genome‐wide association (GWA) studies are becoming a powerful tool for deciphering the genetic basis of complex human diseases and traits. Currently, univariate analysis is the most commonly used method for identifying genes associated with a disease or phenotype under study. A major limitation of univariate analysis is that it may fail to use the information in multiple correlated phenotypes, which are usually measured and collected in practical studies. Multivariate analysis has proven to be a powerful approach in linkage studies of complex diseases and traits, but it has received little attention in GWA studies. In this study, we aim to develop a bivariate analytical method for GWA studies that can handle the complex situation in which one continuous trait and one binary trait are measured. To develop an extended generalized estimating equation (EGEE) approach for bivariate genetic analyses, we combined two different generalized linear models, one for each phenotypic variable, using a seemingly unrelated regression model. Using this modified EGEE method, we assessed the performance of our bivariate analyses through extensive simulations as well as real data analyses. The simulation results demonstrated that our EGEE‐based bivariate analytical method achieves higher statistical power than univariate analyses under a variety of simulation scenarios. Notably, EGEE‐based bivariate analyses have consistent advantages over univariate analyses whether or not a phenotypic correlation exists between the two traits. Our study has practical importance, as one can always use multivariate analyses as a screening tool when multiple phenotypes are available, without sacrificing statistical power or inflating the false‐positive rate. Analyses of empirical GWA data further affirm the advantages of our bivariate analytical method. Genet. Epidemiol. 2009. © 2008 Wiley‐Liss, Inc.
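
The idea of testing one marker jointly against a continuous and a binary trait can be illustrated with a minimal sketch. This is not the EGEE method of the paper. It is a simplified stand-in: a two-degree-of-freedom score test that pairs a linear-model score for the continuous trait with a logistic-model score for the binary trait and uses an empirical (robust) covariance between them, in the spirit of GEE with a working-independence structure. The function names, the simulation design, and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate(n, maf, beta_cont, beta_bin, shared_sd, seed):
    """Simulate (genotype, continuous trait, binary trait) for n individuals.

    All settings are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Additive genotype coded 0/1/2 under Hardy-Weinberg equilibrium.
        g = int(rng.random() < maf) + int(rng.random() < maf)
        # Shared latent term that induces phenotypic correlation.
        u = rng.gauss(0.0, shared_sd)
        y_cont = beta_cont * g + u + rng.gauss(0.0, 1.0)
        logit = -0.5 + beta_bin * g + u
        y_bin = 1 if rng.random() < 1.0 / (1.0 + math.exp(-logit)) else 0
        rows.append((g, y_cont, y_bin))
    return rows


def bivariate_score_test(rows):
    """Joint 2-df score test of 'no marker effect on either trait'."""
    n = len(rows)
    g_bar = sum(r[0] for r in rows) / n
    m_cont = sum(r[1] for r in rows) / n  # null mean of the continuous trait
    m_bin = sum(r[2] for r in rows) / n   # null prevalence of the binary trait
    u1 = u2 = v11 = v12 = v22 = 0.0
    for g, y_cont, y_bin in rows:
        # Per-individual score contributions from the two marginal models.
        s1 = (g - g_bar) * (y_cont - m_cont)
        s2 = (g - g_bar) * (y_bin - m_bin)
        u1 += s1
        u2 += s2
        # Empirical covariance of the scores (robust to trait correlation).
        v11 += s1 * s1
        v12 += s1 * s2
        v22 += s2 * s2
    det = v11 * v22 - v12 * v12
    # Quadratic form U' V^{-1} U written out for the 2x2 case.
    stat = (v22 * u1 * u1 - 2.0 * v12 * u1 * u2 + v11 * u2 * u2) / det
    # The chi-square survival function with 2 df is exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


rows = simulate(n=2000, maf=0.3, beta_cont=0.3, beta_bin=0.3,
                shared_sd=0.5, seed=1)
stat, p_value = bivariate_score_test(rows)
print(f"chi-square (2 df) = {stat:.2f}, p = {p_value:.3g}")
```

Because the covariance of the two scores is estimated from the data, the statistic does not require the phenotypic correlation to be specified in advance, which loosely mirrors the abstract's point that the bivariate test remains usable whether or not the traits are correlated.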
