Adjusting for confounding by neighborhood using complex survey data

Recently, we examined methods of adjusting for confounding by neighborhood when estimating the effect of an individual-level exposure on a binary outcome from complex survey data; those methods were found to fail when the neighborhood sample sizes are small and the sample selection is strongly informative. More recently, other authors have adapted an older method from the genetics literature for application to complex survey data; their adaptation yields a consistent estimator under a broad range of circumstances. The method is based on weighted pseudolikelihoods, in which the contribution from each neighborhood involves all pairs of cases and controls in that neighborhood. The pairs are treated as if they were independent, which yields a pairwise pseudo-conditional likelihood, and the corresponding score equation is then weighted by the inverse probability of sampling each case-control pair. We have greatly simplified the implementation by translating the pairwise pseudo-conditional likelihood into an equivalent ordinary weighted log-likelihood formulation. We show how to program the method using standard software for ordinary logistic regression with complex survey data (e.g., SAS PROC SURVEYLOGISTIC). We also show that the methodology applies to a broader set of sampling scenarios than those considered by the previous authors. We demonstrate the validity of our simplified implementation by applying it to a simulation for which the previous methods failed; the new method performs well in that setting. Finally, we apply the new method to 2009 National Health Interview Survey (NHIS) public-use data to estimate the effect of education on health insurance coverage, adjusting for confounding by neighborhood.
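
To make the translation into an ordinary weighted log-likelihood concrete, the following is a minimal sketch in Python rather than the SAS implementation referred to above. It rests on a standard identity: for a case i and a control j in the same neighborhood, the conditional probability that i is the case, given that exactly one of the two is, equals expit((x_i - x_j)'beta), so the neighborhood intercept cancels. Each case-control pair therefore acts as one observation in a no-intercept logistic regression with response 1 and covariate x_i - x_j. The function names (build_pairs, fit_pairwise), the toy data, and in particular the pair weight w_i * w_j are assumptions made for illustration; the product form is only appropriate if the two members of a pair are sampled independently within the neighborhood.

```python
import numpy as np

def build_pairs(y, x, w, nbhd):
    """Form all within-neighborhood case-control pairs.

    Returns d (case-minus-control covariate differences) and pw (pair weights).
    ASSUMPTION: pair weight = product of the two individual sampling weights.
    """
    y = np.asarray(y)
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    nbhd = np.asarray(nbhd)
    if x.ndim == 1:
        x = x[:, None]
    diffs, weights = [], []
    for g in np.unique(nbhd):
        idx = np.flatnonzero(nbhd == g)
        cases = idx[y[idx] == 1]
        controls = idx[y[idx] == 0]
        for i in cases:
            for j in controls:
                diffs.append(x[i] - x[j])
                weights.append(w[i] * w[j])
    return np.array(diffs).reshape(-1, x.shape[1]), np.array(weights)

def fit_pairwise(d, pw, n_iter=50, tol=1e-10):
    """Maximize sum(pw * log expit(d @ beta)) by Newton-Raphson.

    This is a weighted logistic regression with no intercept in which
    every pair has response 1 and covariate vector d.
    """
    beta = np.zeros(d.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(d @ beta)))
        score = d.T @ (pw * (1.0 - p))
        info = (d * (pw * p * (1.0 - p))[:, None]).T @ d
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Toy data (hypothetical): one neighborhood, one binary exposure.
y = [1, 1, 1, 0, 0, 0]       # 1 = case, 0 = control
x = [1, 1, 0, 0, 0, 1]       # exposure indicator
w = [2, 1, 1, 1, 3, 1]       # individual sampling weights
nbhd = ["A"] * 6             # neighborhood identifier

d, pw = build_pairs(y, x, w, nbhd)
beta = fit_pairwise(d, pw)
print(beta)
```

With a single binary exposure, the estimate reduces to the log of the ratio of the total weight of pairs with an exposed case and unexposed control to the total weight of pairs with the reverse pattern, which gives a simple check on the fit. The sketch returns point estimates only and needs at least one discordant pair of each kind to have a finite solution; standard errors would have to come from design-based variance estimation, which survey software such as that named above provides.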