Avoiding randomization failure in program evaluation, with application to the Medicare Health Support program.

We highlight common problems in the application of random treatment assignment in large-scale program evaluation. Random assignment is the defining feature of modern experimental design, yet errors in design, implementation, and analysis often prevent real-world applications from benefiting from its advantages. The errors discussed here cover the control of variability, levels of randomization, size of treatment arms, and power to detect causal effects, as well as the many problems that commonly lead to post-treatment bias. We illustrate these issues by identifying numerous serious errors in the Medicare Health Support evaluation and offering recommendations to improve the design and analysis of this and other large-scale randomized experiments.
