Between–within models for survival analysis

A popular way to control for confounding in observational studies is to identify clusters of individuals (e.g., twin pairs) such that a large set of potential confounders is constant (shared) within each cluster. By studying the exposure-outcome association within clusters, we in effect control for the whole set of shared confounders. An increasingly popular analysis tool is the between-within (BW) model, which decomposes the exposure-outcome association into a 'within-cluster effect' and a 'between-cluster effect'. BW models are relatively common for nonsurvival outcomes and have been studied in the theoretical literature. Although BW models are straightforward to use for survival outcomes, they have rarely been applied to such outcomes in practice, and their properties in this setting have not been studied in the theoretical literature. In this paper, we propose a gamma BW model for survival outcomes. We compare the properties of this model with those of the more standard stratified Cox regression model and use the proposed model to analyze data from a twin study of obesity and mortality. We find the following: (i) the gamma BW model often produces a more powerful test of the 'within-cluster effect' than stratified Cox regression; and (ii) the gamma BW model is robust against model misspecification, although there are situations where it could give biased estimates.
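The decomposition behind a BW model can be illustrated with a minimal sketch. It assumes the common construction in which each individual's exposure is split into the cluster mean (the 'between' covariate) and the deviation from that mean (the 'within' covariate); the abstract does not state the paper's exact parameterisation. The function name `between_within_covariates` and the BMI values are hypothetical, and the sketch covers only the covariate construction, not the fitting of a gamma frailty or stratified Cox model.

```python
from collections import defaultdict


def between_within_covariates(exposure, cluster_ids):
    """Split each exposure value into a cluster mean and a within-cluster deviation.

    Hypothetical helper for illustration; not taken from the paper.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for x, c in zip(exposure, cluster_ids):
        totals[c] += x
        counts[c] += 1

    # Cluster mean of the exposure: the 'between' covariate.
    means = {c: totals[c] / counts[c] for c in totals}
    between = [means[c] for c in cluster_ids]

    # Deviation from the cluster mean: the 'within' covariate.
    within = [x - means[c] for x, c in zip(exposure, cluster_ids)]
    return between, within


# Made-up BMI values for two twin pairs, "A" and "B" (assumed data).
bmi = [24.0, 28.0, 31.0, 31.0]
pair = ["A", "A", "B", "B"]
between, within = between_within_covariates(bmi, pair)
```

In pair "B" both twins have the same exposure, so their within-cluster deviations are zero: exposure-concordant clusters carry no information about the within-cluster effect under this construction.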
