Conditioning on the propensity score can result in biased estimation of common measures of treatment effect: a Monte Carlo study

Propensity score methods are increasingly being used to estimate causal treatment effects in the medical literature. Conditioning on the propensity score results in unbiased estimation of the expected difference in observed responses to two treatments. The degree to which conditioning on the propensity score introduces bias into the estimation of the conditional odds ratio or conditional hazard ratio, which are frequently used as measures of treatment effect in observational studies, has not been extensively studied. We conducted Monte Carlo simulations to determine the degree to which propensity score matching, stratification on the quintiles of the propensity score, and covariate adjustment using the propensity score result in biased estimation of conditional odds ratios, hazard ratios, and rate ratios. We found that conditioning on the propensity score resulted in biased estimation of the true conditional odds ratio and the true conditional hazard ratio. In all scenarios examined, treatment effects were biased towards the null treatment effect. However, conditioning on the propensity score did not result in biased estimation of the true conditional rate ratio. In contrast, conventional regression methods allowed unbiased estimation of the true conditional treatment effect when all variables associated with the outcome were included in the regression model. The observed bias in propensity score methods is due to the fact that regression models allow one to estimate conditional treatment effects, whereas propensity score methods allow one to estimate marginal treatment effects. In several settings with non-linear treatment effects, marginal and conditional treatment effects do not coincide.
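The gap between marginal and conditional effects can be illustrated with a small deterministic calculation. This is a minimal sketch, not the simulation design used in the study. It assumes a single binary covariate with prevalence 0.5 that is independent of treatment, and the coefficient values are hypothetical, chosen only for illustration. Under a logistic outcome model the marginal odds ratio falls between the null value and the conditional odds ratio, whereas under a log-link (rate) model the marginal and conditional rate ratios coincide.

```python
import math

def expit(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def odds(p):
    return p / (1.0 - p)

# Hypothetical parameters (assumptions for illustration only).
INTERCEPT = -1.0
BETA_TREAT = math.log(2.0)   # conditional log odds ratio / log rate ratio
BETA_X = 2.0                 # effect of the covariate on the outcome
X_VALUES = [0, 1]            # binary covariate, each value with probability 0.5

def marginal_odds_ratio():
    """Average the outcome probability over the covariate in each arm,
    then form the odds ratio of the averaged probabilities."""
    p_treated = sum(expit(INTERCEPT + BETA_TREAT + BETA_X * x) for x in X_VALUES) / len(X_VALUES)
    p_control = sum(expit(INTERCEPT + BETA_X * x) for x in X_VALUES) / len(X_VALUES)
    return odds(p_treated) / odds(p_control)

def marginal_rate_ratio():
    """Same averaging, but with a log link (expected event rate)."""
    r_treated = sum(math.exp(INTERCEPT + BETA_TREAT + BETA_X * x) for x in X_VALUES) / len(X_VALUES)
    r_control = sum(math.exp(INTERCEPT + BETA_X * x) for x in X_VALUES) / len(X_VALUES)
    return r_treated / r_control

conditional_or = math.exp(BETA_TREAT)
conditional_rr = math.exp(BETA_TREAT)
marginal_or = marginal_odds_ratio()
marginal_rr = marginal_rate_ratio()

print("conditional OR:", conditional_or, "marginal OR:", marginal_or)
print("conditional RR:", conditional_rr, "marginal RR:", marginal_rr)
```

Because treatment is independent of the covariate here, there is no confounding. The difference between the two odds ratios arises purely from non-collapsibility of the odds ratio, which is consistent with the abstract's explanation of why propensity score estimates are pulled towards the null.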
