Bias correction to secondary trait analysis with case–control design

In genetic association studies with densely typed genetic markers, it is often of substantial interest to test not only the primary phenotype but also secondary traits for association with the markers. To ascertain samples for the primary phenotype more efficiently, a case–control design or one of its variants, such as the extreme‐value sampling design for a quantitative trait, is often adopted. A secondary trait analysis that does not correct for the sample ascertainment may yield a biased association estimate. We propose a new method to correct the potential bias caused by inadequate adjustment for the sample ascertainment. The method yields explicit correction formulas that can be used both to screen the genetic markers and to rapidly evaluate the sensitivity of the results to the assumed baseline case‐prevalence rate in the population. Simulation studies demonstrate good performance of the proposed approach in comparison with more computationally intensive approaches, such as the compensator approaches and the maximum prospective likelihood approach. We illustrate the approach by analyzing the genetic association of prostate‐specific antigen in a case–control study of prostate cancer in the African American population. Copyright © 2012 John Wiley & Sons, Ltd.
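The ascertainment bias described above can be illustrated with a small simulation. The sketch below is not the correction formula proposed in the paper. It substitutes standard inverse-probability weighting (IPW), which assumes the case and control sampling fractions are known. All names (`simulate`, `weighted_slope`) and all parameter values are hypothetical. The effect sizes are deliberately exaggerated so the bias is easy to see. The marker has no true effect on the secondary trait, yet both influence disease risk, so a naive regression in the pooled case–control sample shows a spurious association.

```python
import math
import random


def weighted_slope(g, y, w):
    """Weighted least-squares slope of y on g, with an intercept."""
    sw = sum(w)
    g_bar = sum(wi * gi for wi, gi in zip(w, g)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (gi - g_bar) * (yi - y_bar) for wi, gi, yi in zip(w, g, y))
    sxx = sum(wi * (gi - g_bar) ** 2 for wi, gi in zip(w, g))
    return sxy / sxx


def simulate(seed=2012, n_pop=100_000, n_per_arm=3_000):
    """Return (naive slope, IPW slope, population prevalence).

    Assumed (hypothetical) data-generating model:
      G ~ Binomial(2, 0.3)           marker genotype (minor-allele count)
      Y ~ Normal(0, 1)               secondary trait, independent of G
      logit P(D=1) = -4 + 1.0*G + 1.0*Y   primary disease status
    The true slope of Y on G in the population is therefore 0.
    """
    rng = random.Random(seed)
    cases, controls = [], []
    for _ in range(n_pop):
        g = int(rng.random() < 0.3) + int(rng.random() < 0.3)
        y = rng.gauss(0.0, 1.0)
        p_disease = 1.0 / (1.0 + math.exp(-(-4.0 + 1.0 * g + 1.0 * y)))
        (cases if rng.random() < p_disease else controls).append((g, y))

    # Case-control ascertainment: equal numbers of cases and controls.
    sampled_cases = rng.sample(cases, n_per_arm)
    sampled_controls = rng.sample(controls, n_per_arm)
    sample = sampled_cases + sampled_controls
    g = [gi for gi, _ in sample]
    y = [yi for _, yi in sample]

    # Naive analysis: ignore how the sample was ascertained.
    naive = weighted_slope(g, y, [1.0] * len(sample))

    # IPW analysis: weight each subject by 1 / (sampling fraction of its arm).
    w_case = len(cases) / n_per_arm
    w_control = len(controls) / n_per_arm
    weights = [w_case] * n_per_arm + [w_control] * n_per_arm
    ipw = weighted_slope(g, y, weights)

    return naive, ipw, len(cases) / n_pop


if __name__ == "__main__":
    naive, ipw, prevalence = simulate()
    print(f"population prevalence: {prevalence:.3f}")
    print(f"naive slope of Y on G: {naive:.3f}")
    print(f"IPW slope of Y on G:   {ipw:.3f}")
```

Under these assumptions the naive slope is pulled away from zero because cases carry both more risk alleles and higher trait values, while the weighted slope stays near the true value of zero. The weights depend on the population case prevalence, which is why sensitivity to the assumed prevalence matters for any such correction.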
