Recently, we examined methods of adjusting the effect of an individual-level exposure on a binary outcome for confounding by neighborhood, using complex survey data; the methods were found to fail when the neighborhood sample sizes are small and the selection bias is strongly informative. More recently, other authors have adapted an older method from the genetics literature for application to complex survey data; their adaptation achieves a consistent estimator under a broad range of circumstances. The method is based on weighted pseudolikelihoods, in which the contribution from each neighborhood involves all pairs of cases and controls in the neighborhood. The pairs are treated as if they were independent, which yields a pairwise pseudo-conditional likelihood, and the corresponding score equation is then weighted by the inverse probability of sampling each case-control pair. We have greatly simplified the implementation by translating the pairwise pseudo-conditional likelihood into an equivalent ordinary weighted log-likelihood formulation. We show how to program the method using standard software for ordinary logistic regression with complex survey data (e.g., SAS PROC SURVEYLOGISTIC). We also show that the methodology applies to a broader set of sampling scenarios than the ones considered by the previous authors. We demonstrate the validity of our simplified implementation by applying it to a simulation for which previous methods failed; the new method performs well. We also apply the new method to an analysis of the 2009 National Health Interview Survey (NHIS) public-use data, to estimate the effect of education on health insurance coverage, adjusting for confounding by neighborhood.
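To make the pairwise construction concrete, the following is a minimal sketch in Python rather than the authors' SAS implementation. It rests on several assumptions that do not come from the abstract: the function names `build_pairs` and `fit_pairwise` are hypothetical, the pair weight is taken to be the product of the two individuals' within-neighborhood weights (which presumes independent sampling within a neighborhood), and a plain Newton-Raphson loop stands in for survey logistic regression software. Each case-control pair contributes the term expit((x_case - x_control)'beta), so the weighted pseudo-likelihood has the form of an ordinary weighted logistic log-likelihood with no intercept, a response that is always 1, and covariate differences as predictors. Variance estimation is not shown.

```python
import numpy as np


def build_pairs(cluster, y, x, w):
    """Form every case-control pair within each cluster (neighborhood).

    Returns the covariate differences x_case - x_control and the pair weights.
    ASSUMPTION: the pair weight is the product of the two individual weights.
    """
    cluster = np.asarray(cluster)
    y = np.asarray(y)
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    diffs, weights = [], []
    for c in np.unique(cluster):
        idx = np.flatnonzero(cluster == c)
        cases = idx[y[idx] == 1]
        controls = idx[y[idx] == 0]
        for i in cases:
            for j in controls:
                diffs.append(x[i] - x[j])
                weights.append(w[i] * w[j])
    if not diffs:
        raise ValueError("no cluster contains both a case and a control")
    return np.vstack(diffs), np.array(weights)


def fit_pairwise(d, pw, n_iter=100, tol=1e-12):
    """Maximize sum_k pw[k] * log(expit(d[k] @ beta)) by Newton-Raphson.

    This is a weighted logistic regression without intercept in which every
    pair has response 1 and predictor d[k].
    """
    beta = np.zeros(d.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(d @ beta)))
        score = d.T @ (pw * (1.0 - p))
        info = (d * (pw * p * (1.0 - p))[:, None]).T @ d
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

As a check on the sketch, consider one neighborhood with a single case at x = 1 (weight 1) and two controls at x = 0 and x = 2 (weights 1 and 3). The weighted score equation reduces to expit(beta) = 1/4, so the estimate is -log 3.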