Improving propensity score weighting using machine learning

Machine learning techniques such as classification and regression trees (CART) have been suggested as promising alternatives to logistic regression for the estimation of propensity scores. The authors examined the performance of various CART-based propensity score models using simulated data. Hypothetical studies of varying sample sizes (n=500, 1000, 2000) with a binary exposure, continuous outcome, and 10 covariates were simulated under seven scenarios differing in the degree of non-linearity and non-additivity in the associations between the covariates and the exposure. Propensity score weights were estimated using logistic regression (all main effects), CART, pruned CART, and the ensemble methods of bagged CART, random forests, and boosted CART. Performance metrics included covariate balance, standard error, per cent absolute bias, and 95 per cent confidence interval (CI) coverage. All methods displayed generally acceptable performance under conditions of either non-linearity or non-additivity alone. However, under conditions of both moderate non-additivity and moderate non-linearity, logistic regression performed poorly, whereas the ensemble methods provided substantially better bias reduction and more consistent 95 per cent CI coverage. The results suggest that ensemble methods, especially boosted CART, may be useful for propensity score weighting.
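
As a concrete illustration of the approach being compared, the sketch below fits a main-effects logistic regression and a boosted-tree propensity model to simulated data, forms inverse-probability-of-treatment weights (1/e(x) for exposed units and 1/(1-e(x)) for unexposed units, where e(x) is the estimated propensity score, a common choice when targeting the average treatment effect), and compares covariate balance via weighted standardized mean differences. This is a minimal sketch in Python with scikit-learn, not the authors' original implementation; the data-generating model, tuning parameters, and the 0.01/0.99 trimming bounds are illustrative assumptions.

    # Illustrative sketch only: propensity score weighting with logistic
    # regression versus a boosted-tree propensity model, followed by a
    # weighted covariate-balance check. All settings below are hypothetical.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import GradientBoostingClassifier

    rng = np.random.default_rng(0)

    # Simulated data: 10 covariates and a binary exposure whose true
    # propensity model is non-linear and non-additive (in the spirit of
    # the paper's more complex scenarios).
    n, p = 2000, 10
    X = rng.normal(size=(n, p))
    true_logit = 0.8 * X[:, 0] - 0.5 * X[:, 1] ** 2 + 0.7 * X[:, 2] * X[:, 3]
    treat = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

    def iptw_weights(ps, treat):
        """Inverse-probability-of-treatment weights (ATE form)."""
        ps = np.clip(ps, 0.01, 0.99)  # trim extreme scores for stability
        return np.where(treat == 1, 1.0 / ps, 1.0 / (1.0 - ps))

    def weighted_smd(x, treat, w):
        """Weighted absolute standardized mean difference for one covariate."""
        m1 = np.average(x[treat == 1], weights=w[treat == 1])
        m0 = np.average(x[treat == 0], weights=w[treat == 0])
        pooled_sd = np.sqrt((x[treat == 1].var() + x[treat == 0].var()) / 2.0)
        return abs(m1 - m0) / pooled_sd

    models = {
        "logistic (main effects)": LogisticRegression(max_iter=1000),
        "boosted CART": GradientBoostingClassifier(
            n_estimators=500, learning_rate=0.01, max_depth=3
        ),
    }

    for name, model in models.items():
        ps = model.fit(X, treat).predict_proba(X)[:, 1]
        w = iptw_weights(ps, treat)
        smds = [weighted_smd(X[:, j], treat, w) for j in range(p)]
        print(f"{name}: max weighted SMD = {max(smds):.3f}")

In a setting like this, a smaller maximum weighted standardized mean difference indicates better covariate balance after weighting; the paper's fuller evaluation additionally examines bias, standard error, and 95 per cent CI coverage of the weighted outcome estimate across repeated simulations.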
