Estimation of causal effects with multiple treatments: a review and new ideas

The propensity score is a common tool for estimating the causal effect of a binary treatment in observational data. In this setting, matching, subclassification, imputation, or inverse probability weighting on the propensity score can reduce the initial covariate bias between the treatment and control groups. With more than two treatment options, however, estimation of causal effects requires additional assumptions and techniques, the implementations of which have varied across disciplines. This paper reviews current methods, and it identifies and contrasts the treatment effects that each one estimates. Additionally, we propose possible matching techniques for use with multiple, nominal categorical treatments, and use simulations to show how such algorithms can yield improved covariate similarity between those in the matched sets, relative to the pre-matched cohort. In sum, this manuscript provides a synopsis of how to notate and use causal methods for categorical treatments.
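As a minimal illustration of one of the approaches named above, the sketch below applies inverse probability weighting with three nominal treatments. It is not the procedure proposed in the paper. Everything in it is an assumption made for illustration: the single covariate `x`, the softmax assignment model, the outcome model, the hypothetical true means `TRUE_MEANS`, and the use of the true (rather than estimated) generalized propensity scores.

```python
import math
import random

# Hypothetical true mean potential outcomes E[Y(t)] for treatments 0, 1, 2.
TRUE_MEANS = [0.0, 1.0, 2.0]


def softmax(scores):
    """Convert a list of scores into probabilities that sum to one."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def simulate(n=20000, seed=0):
    """Simulate (x, t, y, e) rows, where e = P(T = t | x) for the received t."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Assumed assignment model: treatment 1 is favored at high x,
        # treatment 2 at low x, so x confounds the naive comparison.
        gps = softmax([0.0, 0.8 * x, -0.8 * x])
        t = rng.choices([0, 1, 2], weights=gps)[0]
        # Assumed outcome model: Y(t) = TRUE_MEANS[t] + x + noise.
        y = TRUE_MEANS[t] + x + rng.gauss(0.0, 1.0)
        rows.append((x, t, y, gps[t]))
    return rows


def naive_means(rows, k=3):
    """Unadjusted mean outcome within each treatment group."""
    sums, counts = [0.0] * k, [0] * k
    for _, t, y, _ in rows:
        sums[t] += y
        counts[t] += 1
    return [s / c for s, c in zip(sums, counts)]


def ipw_means(rows, k=3):
    """Normalized (Hajek) inverse-probability-weighted estimate of E[Y(t)]."""
    num, den = [0.0] * k, [0.0] * k
    for _, t, y, e in rows:
        w = 1.0 / e  # weight by the inverse probability of the received treatment
        num[t] += w * y
        den[t] += w
    return [a / b for a, b in zip(num, den)]


if __name__ == "__main__":
    data = simulate()
    print("naive:", [round(v, 2) for v in naive_means(data)])
    print("ipw:  ", [round(v, 2) for v in ipw_means(data)])
```

In this simulated setting the unadjusted group means are pulled away from the assumed true values because `x` drives both treatment choice and outcome, while the weighted means should land close to them. In practice the generalized propensity scores are unknown and would have to be estimated, for example with a multinomial model.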
