Optimal full matching for survival outcomes: a method that merits more widespread use

Summary. Matching on the propensity score is a commonly used analytic method for estimating the effects of treatments on outcomes. Commonly used propensity score matching methods include nearest neighbor matching and nearest neighbor caliper matching. Rosenbaum (1991) proposed an optimal full matching approach, in which matched strata are formed consisting of either one treated subject and at least one control subject or one control subject and at least one treated subject. Full matching has rarely been used in the applied literature, and its performance with survival outcomes has not been rigorously evaluated. We propose a method that uses full matching to estimate the effect of treatment on the hazard of the occurrence of the outcome. An extensive set of Monte Carlo simulations was conducted to examine the performance of optimal full matching with survival analysis. Its performance was compared with that of nearest neighbor matching, nearest neighbor caliper matching, and inverse probability of treatment weighting using the propensity score. Full matching had superior performance compared with the two other matching algorithms and performance comparable to that of inverse probability of treatment weighting using the propensity score. We illustrate the application of full matching with survival outcomes by estimating the effect of statin prescribing at hospital discharge on the hazard of post-discharge mortality in a large cohort of patients who were discharged from hospital with a diagnosis of acute myocardial infarction. Optimal full matching merits more widespread adoption in medical and epidemiological research.

© 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
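The stratum structure described in the summary can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the paper: the function names `is_valid_full_match` and `att_style_weights` are hypothetical, and the weighting rule (treated subjects get weight 1, each control gets the ratio of treated to control subjects in its stratum) is one common convention assumed here for illustration, not necessarily the authors' estimator.

```python
from collections import defaultdict


def is_valid_full_match(strata):
    """Check the full-matching structure of a set of strata.

    strata: dict mapping a stratum id to a list of 0/1 treatment indicators.
    A stratum is valid if it holds one treated subject and at least one
    control, or one control and at least one treated subject.
    """
    for members in strata.values():
        n_treated = sum(members)
        n_control = len(members) - n_treated
        if n_treated < 1 or n_control < 1:
            return False  # a stratum needs both groups
        if n_treated > 1 and n_control > 1:
            return False  # at least one side must be a single subject
    return True


def att_style_weights(stratum_ids, treated):
    """Assumed weighting convention (hypothetical, for illustration):
    treated subjects get weight 1; each control gets
    (treated in its stratum) / (controls in its stratum).
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_control, n_treated]
    for s, t in zip(stratum_ids, treated):
        counts[s][t] += 1
    weights = []
    for s, t in zip(stratum_ids, treated):
        n_control, n_treated = counts[s]
        weights.append(1.0 if t == 1 else n_treated / n_control)
    return weights


# Toy data: stratum "a" has 1 treated and 2 controls,
# stratum "b" has 2 treated and 1 control.
ids = ["a", "a", "a", "b", "b", "b"]
trt = [1, 0, 0, 1, 1, 0]
print(att_style_weights(ids, trt))
```

Weights of this kind could then be passed to a weighted survival model; which model and variance estimator to use is a separate choice that this sketch does not address.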

[1]  P. Rosenbaum. A Characterization of Optimal Designs for Observational Studies, 1991.

[2]  B. Hansen. Full Matching in an Observational Study of Coaching for the SAT, 2004.

[3]  Peter C Austin, et al. Predicting mortality among patients hospitalized for heart failure: derivation and validation of a clinical model, 2003, JAMA.

[4]  T. Shakespeare, et al. Observational Studies, 2003.

[5]  P. Rosenbaum, et al. Substantial Gains in Bias Reduction from Matching with a Variable Number of Controls, 2000, Biometrics.

[6]  Joseph Kang, et al. Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data, 2007, arXiv:0804.2958.

[7]  E. Stuart, et al. The performance of inverse probability of treatment weighting and full matching on the propensity score in the presence of model misspecification when estimating the effect of treatment on survival outcomes, 2015, Statistical methods in medical research.

[8]  Peter C Austin, et al. A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003, 2008, Statistics in medicine.

[9]  Ralf Bender, et al. Generating survival times to simulate Cox proportional hazards models, 2005, Statistics in medicine.

[10]  Gary King, et al. MatchIt: Nonparametric Preprocessing for Parametric Causal Inference, 2011.

[11]  Dylan S. Small, et al. The use of bootstrapping when using propensity-score matching without replacement: a simulation study, 2014, Statistics in medicine.

[12]  Elizabeth A Stuart, et al. Matching methods for causal inference: A review and a look forward, 2010, Statistical science: a review journal of the Institute of Mathematical Statistics.

[13]  Patrick Royston, et al. The design of simulation studies in medical statistics, 2006, Statistics in medicine.

[14]  R. Kruse, et al. Mortality following nursing home-acquired lower respiratory infection: LRI severity, antibiotic treatment, and water intake, 2012, Journal of the American Medical Directors Association.

[15]  P. Austin. An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies, 2011, Multivariate behavioral research.

[16]  P. Austin, et al. Optimal caliper widths for propensity-score matching when estimating differences in means and differences in proportions in observational studies, 2010, Pharmaceutical statistics.

[17]  Peter C Austin, et al. A comparison of 12 algorithms for matching on the propensity score, 2013, Statistics in medicine.

[18]  P. Austin. The use of propensity score methods with survival or time-to-event outcomes: reporting measures of effect similar to those used in randomized experiments, 2013, Statistics in medicine.

[19]  Peter C Austin, et al. The performance of different propensity score methods for estimating marginal hazard ratios, 2007, Statistics in medicine.

[20]  Peter C. Austin, et al. A Tutorial and Case Study in Propensity Score Analysis: An Application to Estimating the Effect of In-Hospital Smoking Cessation Counseling on Mortality, 2011, Multivariate behavioral research.

[21]  Andrea Manca, et al. A substantial and confusing variation exists in handling of baseline covariates in randomized controlled trials: a review of trials published in leading medical journals, 2010, Journal of clinical epidemiology.

[22]  D. Rubin, et al. Reducing Bias in Observational Studies Using Subclassification on the Propensity Score, 1984.

[23]  Gary King, et al. Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference, 2007, Political Analysis.

[24]  Til Stürmer, et al. A review of the application of propensity score methods yielded increasing use, advantages in specific settings, but not substantially different estimates compared with conventional multivariable methods, 2006, Journal of clinical epidemiology.

[25]  P. Austin. Statistical Criteria for Selecting the Optimal Number of Untreated Subjects Matched to Each Treated Subject When Using Many-to-One Matching on the Propensity Score, 2010, American Journal of Epidemiology.

[26]  R. Porcher, et al. Propensity score applied to survival data analysis through proportional hazards models: a Monte Carlo study, 2012, Pharmaceutical statistics.

[27]  Peter C Austin, et al. The performance of different propensity-score methods for estimating relative risks, 2008, Journal of clinical epidemiology.

[28]  D. Rubin, et al. Constructing a Control Group Using Multivariate Matched Sampling Methods That Incorporate the Propensity Score, 1985.

[29]  W. Coryell, et al. Antiepileptic drugs for bipolar disorder and the risk of suicidal behavior: a 30-year observational study, 2012, The American journal of psychiatry.

[30]  G. King, et al. Causal Inference without Balance Checking: Coarsened Exact Matching, 2012, Political Analysis.

[31]  P. Rosenbaum. Model-Based Direct Adjustment, 1987.

[32]  D. Rubin, et al. The central role of the propensity score in observational studies for causal effects, 1983.

[33]  Harold I Feldman, et al. Model Selection, Confounder Control, and Marginal Structural Models, 2004.

[34]  Paul R. Rosenbaum, et al. Comparison of Multivariate Matching Methods: Structures, Distances, and Algorithms, 1993.

[35]  Peter C Austin, et al. Conditioning on the propensity score can result in biased estimation of common measures of treatment effect: a Monte Carlo study, 2007, Statistics in medicine.

[36]  Jennifer M. Polinski, et al. Plasmode simulation for the evaluation of pharmacoepidemiologic methods in complex healthcare databases, 2014, Comput. Stat. Data Anal.

[37]  H. Krumholz, et al. Factors associated with racial differences in myocardial infarction outcomes, 2009, Annals of internal medicine.

[38]  Peter C Austin, et al. Propensity-score matching in the cardiovascular surgery literature from 2004 to 2006: a systematic review and suggestions for improvement, 2007, The Journal of thoracic and cardiovascular surgery.

[39]  Donald B Rubin, et al. On principles for modeling propensity scores in medical research, 2004, Pharmacoepidemiology and drug safety.

[40]  Peter C Austin, et al. The performance of different propensity score methods for estimating marginal odds ratios, Statistics in Medicine 2007; 26:3078–3094, 2008.

[41]  Thorsten M. Buzug, et al. IR-04-067 Adaptive Speciation: Introduction, 2004.

[42]  James Stafford, et al. The Performance of Two Data-Generation Processes for Data with Specified Marginal Treatment Odds Ratios, 2008, Commun. Stat. Simul. Comput.

[43]  P. Austin, et al. Reader's guide to critical appraisal of cohort studies: 2. Assessing potential for confounding, 2005, BMJ: British Medical Journal.

[44]  J. Schafer, et al. Average causal effects from nonrandomized studies: a practical guide and simulated example, 2008, Psychological methods.

[45]  Peter C. Austin, et al. A Data-Generation Process for Data with Specified Risk Differences or Numbers Needed to Treat, 2010, Commun. Stat. Simul. Comput.

[46]  Peter C Austin, et al. A comparison of the ability of different propensity score models to balance measured variables between treated and untreated subjects: a Monte Carlo study, 2007, Statistics in medicine.

[47]  Peter C Austin, et al. The performance of different propensity-score methods for estimating differences in proportions (risk differences or absolute risk reductions) in observational studies, 2010, Statistics in medicine.

[48]  Jerome P. Reiter, et al. Interval estimation for treatment effects using propensity score matching, 2006, Statistics in medicine.

[49]  D. Hedeker, et al. Two propensity score-based strategies for a three-decade observational study: investigating psychotropic medications and suicide risk, 2012, Statistics in medicine.

[50]  B. Hansen, et al. Optimal Full Matching and Related Designs via Network Flows, 2006.

[51]  L. J. Wei, et al. The Robust Inference for the Cox Proportional Hazards Model, 1989.

[52]  D. Rubin. Using Propensity Scores to Help Design Observational Studies: Application to the Tobacco Litigation, 2001, Health Services and Outcomes Research Methodology.

[53]  Peter C Austin, et al. Report Card on Propensity-Score Matching in the Cardiology Literature From 2004 to 2006: A Systematic Review, 2008, Circulation. Cardiovascular quality and outcomes.

[54]  P. Austin. Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples, 2009, Statistics in medicine.

[55]  Peter C Austin, et al. Effectiveness of public report cards for improving the quality of cardiac care: the EFFECT study: a randomized trial, 2009, JAMA.