A tutorial comparing different covariate balancing methods with an application evaluating the causal effect of exercise on the progression of Huntington's Disease

Randomized controlled trials are the gold standard for measuring the causal effects of treatments on clinical outcomes. However, randomized trials are not always feasible, so causal treatment effects must often be inferred from observational data. Observational study designs support causal conclusions only when statistical techniques are used to account for the imbalance of confounders across treatment groups and key assumptions hold. Propensity score (PS) weighting and balance weighting are two useful techniques that aim to reduce the imbalances between treatment groups by weighting the groups to look alike on the observed confounders. Many methods are available to estimate PS and balancing weights, but it is unclear a priori which will achieve the best trade-off between covariate balance and effective sample size. Weighted analyses are further complicated in small studies with limited sample sizes, which are common when studying rare diseases. To address these issues, we present a step-by-step guide to covariate balancing strategies, including how to evaluate overlap, obtain estimates of PS and balancing weights, check for covariate balance, and assess sensitivity to unobserved confounding. We compare the performance of a number of commonly used estimation methods on a synthetic data set based on the Physical Activity and Exercise Outcomes in Huntington Disease (PACE-HD) study, which explored whether enhanced physical activity affects the progression and severity of the disease. We provide general guidelines for choosing a method to estimate PS and balancing weights, for interpreting the results, and for conducting sensitivity analyses. We also present R code for implementing the different methods and assessing balance.
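The trade-off between covariate balance and effective sample size mentioned above can be made concrete with a small sketch. The example below is illustrative only and is not the tutorial's R code: it uses Python with the standard library, a made-up toy data set, and hypothetical propensity scores that are simply assumed rather than fitted. The function names are likewise assumptions chosen for this illustration. It computes inverse probability of treatment weights for the average treatment effect, Kish's approximate effective sample size, and a standardized mean difference before and after weighting.

```python
from math import sqrt


def ate_weights(treated, ps):
    """Inverse probability weights for the ATE: 1/e if treated, 1/(1 - e) otherwise."""
    return [1.0 / e if t else 1.0 / (1.0 - e) for t, e in zip(treated, ps)]


def effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def standardized_mean_difference(x, treated, weights=None):
    """Difference in (weighted) group means divided by the pooled unweighted SD."""
    if weights is None:
        weights = [1.0] * len(x)
    x1 = [v for v, t in zip(x, treated) if t]
    x0 = [v for v, t in zip(x, treated) if not t]
    w1 = [w for w, t in zip(weights, treated) if t]
    w0 = [w for w, t in zip(weights, treated) if not t]
    pooled_sd = sqrt((sample_variance(x1) + sample_variance(x0)) / 2.0)
    return (weighted_mean(x1, w1) - weighted_mean(x0, w0)) / pooled_sd


# Toy data (hypothetical): treatment indicator, one confounder, assumed PS values.
treated = [1, 1, 1, 1, 0, 0, 0, 0]
age = [60, 55, 50, 45, 50, 45, 40, 35]
ps = [0.8, 0.7, 0.5, 0.4, 0.6, 0.5, 0.3, 0.2]

w = ate_weights(treated, ps)
smd_before = standardized_mean_difference(age, treated)
smd_after = standardized_mean_difference(age, treated, w)
ess = effective_sample_size(w)

print(f"SMD before weighting: {smd_before:.2f}")
print(f"SMD after weighting:  {smd_after:.2f}")
print(f"Effective sample size: {ess:.2f} of {len(w)}")
```

In this toy example, weighting shrinks the standardized mean difference on age while the effective sample size falls below the number of observations, which is the trade-off the abstract refers to. In practice the effective sample size is often reported separately for each treatment group, and the choice of denominator for the standardized mean difference varies across software.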
