Logistic regression of family data from retrospective study designs

We wish to study the effects of genetic and environmental factors on disease risk, using data from families ascertained because they contain multiple cases of the disease. To do so, we must account for the way participants were ascertained, and for within-family correlations in both disease occurrences and covariates. We model the joint probability distribution of the covariates of ascertained family members, given family disease occurrence and pedigree structure. We describe two such covariate models: the random effects model and the marginal model. Both models assume a logistic form for the distribution of one person's covariates that involves a vector $\beta$ of regression parameters. The components of $\beta$ in the two models have different interpretations, and they differ in magnitude when the covariates are correlated within families. We describe ascertainment assumptions needed to estimate consistently the parameters $\beta_{RE}$ in the random effects model and the parameters $\beta_{M}$ in the marginal model. Under the ascertainment assumptions for the random effects model, we show that conditional logistic regression (CLR) of matched family data gives a consistent estimate $\hat{\beta}_{RE}$ for $\beta_{RE}$ and a consistent estimate for the covariance matrix of $\hat{\beta}_{RE}$. Under the ascertainment assumptions for the marginal model, we show that unconditional logistic regression (ULR) gives a consistent estimate for $\beta_{M}$, and we give a consistent estimator for its covariance matrix. The random effects/CLR approach is simple to use and to interpret, but it can use data only from families containing both affected and unaffected members. The marginal/ULR approach uses data from all individuals, but its variance estimates require special computations. A C program to compute these variance estimates is available at http://www.stanford.edu/dept/HRP/epidemiology. We illustrate these pros and cons by application to data on the effects of parity on ovarian cancer risk in mother/daughter pairs, and use simulations to study the performance of the estimates. Genet Epidemiol 25:177-189, 2003. © 2003 Wiley-Liss, Inc.
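The remark that the CLR approach can use data only from families containing both affected and unaffected members can be illustrated with a minimal sketch. It assumes the standard conditional logistic likelihood with the family as the matched set, a single scalar covariate, and a scalar coefficient `beta`. The function name `clr_family_likelihood` and the example values are hypothetical, and this is not the C program mentioned above.

```python
from itertools import combinations
from math import exp


def clr_family_likelihood(x, d, beta):
    """Conditional likelihood contribution of one family (matched set).

    x    : list of scalar covariates, one per family member (assumed form)
    d    : list of 0/1 disease indicators, same length as x
    beta : scalar log odds ratio (assumption: one covariate only)
    """
    n_cases = sum(d)
    # Numerator: covariates of the members who are actually affected.
    num = exp(beta * sum(xi for xi, di in zip(x, d) if di == 1))
    # Denominator: every way of assigning n_cases affected members
    # within the family.
    den = sum(
        exp(beta * sum(x[i] for i in subset))
        for subset in combinations(range(len(x)), n_cases)
    )
    return num / den


# Discordant mother/daughter pair: the contribution depends on beta.
discordant = clr_family_likelihood([2.0, 0.0], [1, 0], beta=0.5)

# Concordant pairs (both affected, or neither): exactly one assignment is
# possible, so the contribution is 1 whatever the value of beta.
both_affected = clr_family_likelihood([2.0, 0.0], [1, 1], beta=0.5)
none_affected = clr_family_likelihood([2.0, 0.0], [0, 0], beta=-3.0)
```

In this sketch a concordant family contributes a constant factor of 1 to the conditional likelihood, so it carries no information about `beta`. That is one way to see why only families with both affected and unaffected members enter the CLR estimate.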
