Was There a Riverside Miracle? A Hierarchical Framework for Evaluating Programs With Grouped Data

This article discusses the evaluation of programs implemented at multiple sites. Two frequently used methods are pooling the data or using fixed effects (an extreme version of which estimates a separate model for each site). The former approach ignores site effects. The latter incorporates site effects but lacks a framework for predicting the impact of subsequent implementations of the program (e.g., would a new implementation resemble Riverside?). I present a hierarchical model that lies between these two extremes. Using data from the Greater Avenues for Independence (GAIN) demonstration, I show that the model captures much of the site-to-site variation in treatment effects but has less uncertainty than estimating the treatment effect separately for each site. I also show that when predictive uncertainty is ignored, the treatment impact for the Riverside sites is significant, but when predictive uncertainty is considered, the impact for these sites is insignificant. Finally, I demonstrate that the model extrapolates site effects with reasonable accuracy when the site being predicted does not differ substantially from the sites already observed. For example, the San Diego treatment effects could have been predicted from their site characteristics, but the Riverside effects are consistently underpredicted.
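The partial-pooling idea behind the hierarchical framework can be illustrated with a minimal sketch. This is not the paper's actual estimator (which is Bayesian and fit by MCMC); it is an empirical-Bayes shrinkage calculation showing how per-site estimates are pulled toward a common mean in proportion to their noise, interpolating between the fully pooled and fully separate (fixed-effects) extremes. All numbers below are synthetic, and the site names are used only as labels.

```python
import numpy as np

def partial_pool(estimates, std_errors):
    """Shrink noisy per-site estimates toward a precision-weighted grand mean.

    Returns (shrunken_estimates, grand_mean, tau2), where tau2 is a
    method-of-moments (DerSimonian-Laird) estimate of the between-site
    variance of the true treatment effects.
    """
    y = np.asarray(estimates, dtype=float)
    s2 = np.asarray(std_errors, dtype=float) ** 2
    w = 1.0 / s2
    # Precision-weighted grand mean: the fully pooled estimate.
    mu = np.sum(w * y) / np.sum(w)
    # Method-of-moments between-site variance, truncated at zero.
    k = len(y)
    q = np.sum(w * (y - mu) ** 2)             # Cochran's Q statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Shrinkage factor: b = 1 recovers full pooling, b = 0 recovers
    # the separate per-site (fixed-effects) estimates.
    b = s2 / (s2 + tau2)
    theta = b * mu + (1.0 - b) * y
    return theta, mu, tau2

# Synthetic per-site impact estimates and standard errors
# (illustrative only; not the GAIN data).
sites = ["Alameda", "Butte", "Los Angeles", "Riverside", "San Diego", "Tulare"]
est = [150.0, 300.0, 100.0, 900.0, 400.0, 250.0]
se = [200.0, 250.0, 150.0, 180.0, 160.0, 220.0]

theta, mu, tau2 = partial_pool(est, se)
for name, raw, shrunk in zip(sites, est, theta):
    print(f"{name:12s} raw={raw:7.1f}  shrunken={shrunk:7.1f}")
```

Each shrunken estimate lies between the site's own estimate and the grand mean, with noisier sites (larger standard errors) pulled further toward the mean. The full hierarchical model in the paper adds site-level covariates, which is what allows out-of-sample prediction for a new site.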
