Estimating the effect of treatment on binary outcomes using full matching on the propensity score

Many non-experimental studies use propensity-score methods to estimate causal effects by balancing treatment and control groups on a set of observed baseline covariates. Full matching on the propensity score has emerged as a particularly effective and flexible method that uses all available data and creates well-balanced treatment and comparison groups. However, full matching has been used infrequently with binary outcomes, and relatively little work has investigated its performance when estimating effects on binary outcomes. This paper describes methods for estimating the effect of treatment on binary outcomes when using full matching. We then used Monte Carlo simulations to evaluate the performance of these methods based on full matching (with and without a caliper), and compared their performance with that of nearest neighbour matching (with and without a caliper) and inverse probability of treatment weighting. The simulations varied the prevalence of the treatment and the strength of association between the covariates and treatment assignment. The results indicated that all of the approaches worked well when confounding was relatively weak. With stronger confounding, the relative performance of the methods varied, with nearest neighbour matching with a caliper showing consistently good performance across a wide range of settings. We illustrate the approaches using a study estimating the effect of inpatient smoking cessation counselling on survival following hospitalization for a heart attack.
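As a rough illustration of how a binary-outcome effect might be computed once full matching has been done, the sketch below turns matched sets into weights and estimates a weighted risk difference. It is not taken from the paper. It assumes the matched sets have already been formed by some full-matching routine, and it uses one common weighting convention (treated subjects get weight 1, and each control gets the ratio of treated to control subjects in its matched set), which targets the effect in the treated. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def att_weights(treated, matched_set):
    """Weights from full-matching sets (assumed convention, not the paper's code).

    treated:     list of 0/1 treatment indicators
    matched_set: list of matched-set labels, one per subject
    Treated subjects get weight 1; each control gets
    (number treated in its set) / (number of controls in its set).
    """
    n_treated = defaultdict(int)
    n_control = defaultdict(int)
    for t, s in zip(treated, matched_set):
        if t == 1:
            n_treated[s] += 1
        else:
            n_control[s] += 1

    weights = []
    for t, s in zip(treated, matched_set):
        if t == 1:
            weights.append(1.0)
        else:
            weights.append(n_treated[s] / n_control[s])
    return weights


def weighted_risk_difference(treated, outcome, weights):
    """Weighted proportion with the outcome in treated minus that in controls."""
    sum_w = {0: 0.0, 1: 0.0}
    sum_wy = {0: 0.0, 1: 0.0}
    for t, y, w in zip(treated, outcome, weights):
        sum_w[t] += w
        sum_wy[t] += w * y
    return sum_wy[1] / sum_w[1] - sum_wy[0] / sum_w[0]


# Hypothetical toy data: set "a" has 1 treated and 2 controls,
# set "b" has 2 treated and 1 control (both valid full-matching sets).
treated = [1, 0, 0, 1, 1, 0]
matched_set = ["a", "a", "a", "b", "b", "b"]
outcome = [1, 0, 1, 0, 1, 1]

weights = att_weights(treated, matched_set)
rd = weighted_risk_difference(treated, outcome, weights)
print(weights)
print(round(rd, 4))
```

The same weights could in principle feed a weighted estimate of a relative risk or a marginal odds ratio. Variance estimation is a separate question that this sketch does not address.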
