Robust joint analysis allowing for model uncertainty in two-stage genetic association studies

Background: The cost-efficient two-stage design is often used in genome-wide association studies (GWASs) to search for genetic loci underlying susceptibility to complex diseases. Replication-based analysis, which considers the data from each stage separately, often suffers from a loss of efficiency. Joint analysis, which combines data from both stages, has been proposed and is widely used to improve efficiency. However, existing joint analyses are based on test statistics derived under an assumed genetic model, and thus might not perform robustly when the assumed genetic model is inappropriate.

Results: In this paper, we propose joint analyses based on two robust tests, MERT and MAX3, for GWASs under a two-stage design. We develop computationally efficient procedures and formulas for significance level evaluation and power calculation. The performance of the proposed approaches is investigated through extensive simulation studies and a real example. Numerical results show that the joint analysis based on the MAX3 test statistic has the best overall performance.

Conclusions: MAX3 joint analysis is the most robust procedure among the joint analyses considered, and we recommend using it in two-stage genome-wide association studies.
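As a rough illustration of the kind of robust statistic referred to here, the sketch below computes MAX3 for a single marker from one case-control genotype table. It assumes the definition commonly used in the literature, which the abstract itself does not spell out: MAX3 is the largest absolute value among three Cochran-Armitage trend statistics with genotype scores (0, 0, 1), (0, 0.5, 1) and (0, 1, 1) for the recessive, additive and dominant models. The function names `trend_z` and `max3` are hypothetical, and the sketch covers one stage only; it is not the two-stage joint procedure proposed in the paper.

```python
import math

# Genotype scores (x0, x1, x2) for the three genetic models (assumed convention).
SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def trend_z(cases, controls, scores):
    """Cochran-Armitage trend statistic for genotype counts (g0, g1, g2)."""
    r_total = sum(cases)
    s_total = sum(controls)
    n_total = r_total + s_total
    genotype_totals = [r + s for r, s in zip(cases, controls)]

    # Score-weighted difference between case and control genotype counts.
    numerator = sum(
        x * (s_total * r - r_total * s)
        for x, r, s in zip(scores, cases, controls)
    )
    # Null variance term of the trend statistic.
    spread = (
        n_total * sum(x * x * n for x, n in zip(scores, genotype_totals))
        - sum(x * n for x, n in zip(scores, genotype_totals)) ** 2
    )
    if r_total == 0 or s_total == 0 or spread <= 0:
        raise ValueError("trend statistic is undefined for this table")
    return math.sqrt(n_total) * numerator / math.sqrt(r_total * s_total * spread)


def max3(cases, controls):
    """Maximum absolute trend statistic over the three genetic models."""
    return max(abs(trend_z(cases, controls, x)) for x in SCORES.values())


if __name__ == "__main__":
    # Made-up counts for genotypes with 0, 1 and 2 risk alleles.
    print(max3((10, 20, 30), (30, 20, 10)))
```

Because the three trend statistics are correlated, MAX3 does not follow a standard normal distribution under the null hypothesis, so its significance level has to be evaluated from the joint distribution of the three statistics rather than from a single normal quantile.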
