Some Methods of Propensity‐Score Matching Had Superior Performance to Others: Results of an Empirical Investigation and Monte Carlo Simulations

Propensity-score matching is increasingly being used to reduce the impact of treatment-selection bias when estimating causal treatment effects using observational data. Several propensity-score matching methods are currently employed in the medical literature: matching on the logit of the propensity score using calipers of width either 0.2 or 0.6 of the standard deviation of the logit of the propensity score; matching on the propensity score using calipers of 0.005, 0.01, 0.02, 0.03, and 0.1; and 5 → 1 digit matching on the propensity score. We conducted empirical investigations and Monte Carlo simulations to compare the relative performance of these competing methods. Using a large sample of patients hospitalized with a heart attack, with exposure defined as receipt of a statin prescription at hospital discharge, we found that the 8 different methods produced propensity-score matched samples in which qualitatively equivalent balance in measured baseline variables was achieved between treated and untreated subjects. Seven of the 8 propensity-score matched samples resulted in qualitatively similar estimates of the reduction in mortality due to statin exposure. In contrast, 5 → 1 digit matching resulted in a qualitatively different estimate of relative risk reduction compared to the other 7 methods. Using Monte Carlo simulations, we found that matching on the logit of the propensity score using calipers of width 0.2 of the standard deviation of the logit, and matching on the propensity score using calipers of width 0.02 and 0.03, tended to have superior performance for estimating treatment effects.
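To make the caliper idea concrete, the following is a minimal sketch of greedy 1:1 nearest-neighbour matching without replacement on the logit of the propensity score, using a caliper of 0.2 standard deviations. It is an illustration under stated assumptions, not the authors' implementation: the function name `greedy_caliper_match`, the use of the pooled sample standard deviation across all subjects, and the choice to process treated subjects in input order are all assumptions introduced here.

```python
import math
import statistics


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def greedy_caliper_match(treated_ps, control_ps, caliper_sd=0.2):
    """Greedy 1:1 matching without replacement on the logit of the propensity score.

    Assumptions (hypothetical, for illustration only):
      - the caliper is caliper_sd times the sample SD of the logit,
        computed over treated and untreated subjects pooled together;
      - treated subjects are processed in the order given.
    Returns (pairs, caliper), where pairs holds (treated_index, control_index).
    """
    t_logit = [logit(p) for p in treated_ps]
    c_logit = [logit(p) for p in control_ps]
    caliper = caliper_sd * statistics.stdev(t_logit + c_logit)

    available = set(range(len(c_logit)))  # untreated subjects not yet matched
    pairs = []
    for i, lt in enumerate(t_logit):
        if not available:
            break
        # Nearest remaining untreated subject; ties broken by lowest index.
        j = min(available, key=lambda k: (abs(c_logit[k] - lt), k))
        # Accept the match only if it lies within the caliper.
        if abs(c_logit[j] - lt) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs, caliper


# Toy propensity scores, invented purely for demonstration.
treated = [0.30, 0.50, 0.90]
control = [0.31, 0.55, 0.10, 0.48]
pairs, caliper = greedy_caliper_match(treated, control)
print(pairs, round(caliper, 3))
```

In this toy input, the third treated subject has no untreated subject within the caliper and is left unmatched, which is the trade-off that caliper width controls: narrower calipers give closer matches but discard more treated subjects. A fixed caliper on the propensity-score scale (for example 0.02) could be sketched the same way by comparing raw scores against a constant instead of logits against a multiple of the standard deviation.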
