The role of prediction modeling in propensity score estimation: an evaluation of logistic regression, bCART, and the covariate-balancing propensity score.

The covariate-balancing propensity score (CBPS) extends logistic regression to simultaneously optimize covariate balance and treatment prediction. Although the CBPS has been shown to perform well in certain settings, its performance has not been evaluated in settings specific to pharmacoepidemiology and large database research. In this study, we use both simulations and empirical data to compare the performance of the CBPS with that of logistic regression and boosted classification and regression trees (bCART). We simulated various degrees of model misspecification to evaluate the robustness of each propensity score (PS) estimation method. We then applied these methods to compare the effect of initiating glucagon-like peptide-1 agonists versus sulfonylureas on cardiovascular events and all-cause mortality in the US Medicare population in 2007–2009. In simulations, the CBPS was generally more robust than misspecified logistic PS models and bCART in terms of balancing covariates and reducing bias. All PS estimation methods performed similarly in the empirical example. For settings common to pharmacoepidemiology, logistic regression with balance checks to assess model specification is a valid method for PS estimation, but it can require refitting multiple models until covariate balance is achieved. The CBPS is a promising method to improve the robustness of PS models.
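
As an illustration of the "logistic regression with balance checks" workflow mentioned above, the following is a minimal sketch and not the study's actual code. It assumes simulated data with two hypothetical covariates, a correctly specified logistic PS model fitted by Newton-Raphson, inverse probability of treatment weights, and the standardized mean difference as the balance metric. All names, coefficients, and sample sizes are assumptions made for illustration only.

```python
import numpy as np


def fit_logistic(X, t, n_iter=25):
    """Fit a logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (t - p)                          # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def standardized_mean_diff(x, t, w):
    """Weighted difference in means of x (treated minus untreated),
    scaled by the pooled unweighted standard deviation."""
    m1 = np.average(x[t == 1], weights=w[t == 1])
    m0 = np.average(x[t == 0], weights=w[t == 0])
    pooled_sd = np.sqrt((x[t == 1].var(ddof=1) + x[t == 0].var(ddof=1)) / 2.0)
    return (m1 - m0) / pooled_sd


# Simulated cohort (hypothetical): two confounders drive treatment assignment.
rng = np.random.default_rng(0)
n = 5000
covs = rng.normal(size=(n, 2))
true_logit = -0.5 + 0.8 * covs[:, 0] - 0.5 * covs[:, 1]
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

# Estimate the propensity score with a main-effects logistic model.
X = np.column_stack([np.ones(n), covs])
beta = fit_logistic(X, t)
ps = 1.0 / (1.0 + np.exp(-X @ beta))

# Inverse probability of treatment weights.
iptw = np.where(t == 1, 1.0 / ps, 1.0 / (1.0 - ps))

# Balance check: compare covariate balance before and after weighting.
unweighted = np.ones(n)
smd_before = [standardized_mean_diff(covs[:, j], t, unweighted) for j in range(2)]
smd_after = [standardized_mean_diff(covs[:, j], t, iptw) for j in range(2)]

for j in range(2):
    print(f"covariate {j}: SMD before = {smd_before[j]:.3f}, after = {smd_after[j]:.3f}")
```

In practice, if the weighted standardized mean differences remained large, the PS model would be respecified (for example, by adding interaction or nonlinear terms) and the check repeated. This refitting loop is the cost the abstract attributes to the logistic approach, and it is the step the CBPS aims to avoid by building balance conditions into estimation.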
