Estimating the Effect of a Community-Based Intervention with Two Communities

Abstract Because community-based programs must be evaluated in practice, there is substantial interest in methods for estimating the causal effects of community-level treatments or exposures on individual-level outcomes. The challenge is that different communities have different environmental factors affecting individual outcomes, and all individuals in a community share the same environment and the same intervention. In practice, data are often available from only a small number of communities, making it difficult, if not impossible, to adjust for these environmental confounders. In this paper we consider an extreme version of this dilemma, in which two communities each receive a different level of the intervention, and covariates and outcomes are measured on a random sample of independent individuals from each of the two populations; the results can be straightforwardly generalized to settings in which more than two communities are sampled. We address the question of what conditions are needed to estimate the causal effect of the intervention. This effect is defined in terms of an ideal experiment: the exposed level of the intervention is assigned to both communities and individual outcomes are measured in the combined population; then the clock is turned back, a control level of the intervention is assigned to both communities, and individual outcomes are again measured in the combined population. We refer to the difference in the expectations of these outcomes as the marginal (overall) treatment effect. We also discuss conditions needed for estimation of the treatment effect on the treated community. We apply a nonparametric structural equation model to define these causal effects and to establish conditions under which they are identified.
These identifiability conditions provide guidance for the design of studies to investigate community-level causal effects and for assessing the validity of causal interpretations when data are available from only a few communities. When the identifiability conditions fail to hold, the proposed statistical parameters still provide nonparametric (albeit non-causal) treatment effect measures whose statistical interpretations do not depend on model specifications. In addition, we study the use of a matched cohort sampling design in which the units of different communities are matched on individual factors. Finally, we provide semiparametric efficient and doubly robust targeted maximum likelihood estimators of the community-level causal effect based on i.i.d. sampling and matched cohort sampling.
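
As a rough illustration of the kind of covariate-adjusted, nonparametric contrast the abstract describes, the sketch below computes a simple plug-in estimate of a marginal treatment effect from two communities. It is not the targeted maximum likelihood estimator of the paper. It assumes, purely for illustration, a single discrete individual-level covariate `w`, a community indicator `a` (1 for the exposed community, 0 for the control community), and a numeric outcome `y`; the function name `marginal_effect` and the toy records are hypothetical. Within each covariate stratum the outcome means of the two communities are differenced, and the differences are averaged over the covariate distribution of the combined sample.

```python
from collections import defaultdict


def marginal_effect(records):
    """Plug-in estimate of sum_w P(W=w) * (E[Y|A=1,W=w] - E[Y|A=0,W=w]).

    records: iterable of (a, w, y) tuples, where a is 0 or 1 (community /
    intervention level), w is a discrete stratum label, and y is numeric.
    All of these names are illustrative assumptions, not the paper's notation.
    """
    sums = defaultdict(float)    # outcome totals per (a, w) cell
    counts = defaultdict(int)    # sample sizes per (a, w) cell
    w_counts = defaultdict(int)  # stratum sizes in the combined sample
    n = 0
    for a, w, y in records:
        sums[(a, w)] += y
        counts[(a, w)] += 1
        w_counts[w] += 1
        n += 1

    effect = 0.0
    for w, n_w in w_counts.items():
        # Every stratum must be observed in both communities; otherwise
        # the stratum-specific contrast cannot be computed from the data.
        if counts[(1, w)] == 0 or counts[(0, w)] == 0:
            raise ValueError(f"stratum {w!r} is not observed in both communities")
        diff = sums[(1, w)] / counts[(1, w)] - sums[(0, w)] / counts[(0, w)]
        effect += (n_w / n) * diff  # weight by combined-sample stratum share
    return effect


# Toy data: (community indicator, covariate stratum, outcome).
records = [
    (1, "young", 1.0), (1, "young", 0.0), (1, "old", 1.0),
    (0, "young", 0.0), (0, "young", 0.0),
    (0, "old", 1.0), (0, "old", 0.0), (0, "old", 0.0),
]
estimate = marginal_effect(records)
```

Whether a contrast of this form has a causal interpretation depends on the identifiability conditions discussed in the abstract; when they fail, it remains only a covariate-adjusted statistical comparison of the two communities.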
