Sparse Bayesian Causal Forests for Heterogeneous Treatment Effects Estimation

This paper develops a sparsity-inducing version of Bayesian Causal Forests, a recently proposed nonparametric causal regression model that employs Bayesian Additive Regression Trees and is specifically designed to estimate heterogeneous treatment effects using observational data. The sparsity-inducing component we introduce is motivated by empirical studies where the number of pre-treatment covariates available is non-negligible, leading to different degrees of sparsity underlying the surfaces of interest in the estimation of individual treatment effects. The extended version presented in this work, which we name Sparse Bayesian Causal Forest, is equipped with an additional pair of priors that allow the model to adjust the weight of each covariate through the corresponding number of splits in the tree ensemble. These priors improve the model’s adaptability to sparse settings and make it possible to perform fully Bayesian variable selection within a framework for treatment effect estimation, and thus to uncover the moderating factors driving heterogeneity. In addition, the method allows prior knowledge about the relevant confounding pre-treatment covariates, and the relative magnitude of their impact on the outcome, to be incorporated in the model. We illustrate the performance of our method in simulated studies, in comparison to Bayesian Causal Forest and other state-of-the-art models, to demonstrate how it scales with an increasing number of covariates and how it handles strongly confounded scenarios. Finally, we also provide an example application using real-world data.

This work was supported by a British Heart Foundation-Turing Cardiovascular Data Science Award (BCDSA/100003). Corresponding author: alberto.caron.19@ucl.ac.uk, 1-19 Torrington Pl, London WC1E 7HB.

arXiv:2102.06573v1 [stat.ME] 12 Feb 2021
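The idea of weighting covariates by their number of splits can be illustrated with a minimal sketch. The abstract does not state the form of the priors, so the code below assumes a symmetric Dirichlet prior on the splitting probabilities, a common choice in the sparse regression tree literature. It also assumes that the "pair of priors" corresponds to one weight vector per tree ensemble. The function names, the hyperparameter `alpha`, and the split counts are all hypothetical and chosen for illustration only.

```python
import numpy as np


def posterior_mean_split_probs(split_counts, alpha=1.0):
    """Posterior mean of splitting probabilities under an ASSUMED
    Dirichlet(alpha/p, ..., alpha/p) prior, given how many times each
    covariate was used in a splitting rule across the ensemble."""
    counts = np.asarray(split_counts, dtype=float)
    p = counts.size
    concentration = alpha / p + counts  # conjugate Dirichlet update
    return concentration / concentration.sum()


def draw_split_probs(split_counts, alpha=1.0, rng=None):
    """One draw of the splitting probabilities from the same ASSUMED
    Dirichlet conditional, as could be used inside a Gibbs step."""
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(split_counts, dtype=float)
    p = counts.size
    return rng.dirichlet(alpha / p + counts)


# Hypothetical split counts for five covariates in two separate ensembles
# (assumption: one ensemble for the prognostic surface, one for the
# treatment effect surface, each with its own weight vector).
prognostic_counts = [12, 0, 0, 3, 0]
moderating_counts = [0, 9, 0, 0, 1]

w_prognostic = posterior_mean_split_probs(prognostic_counts)
w_moderating = posterior_mean_split_probs(moderating_counts)
sample = draw_split_probs(prognostic_counts, rng=np.random.default_rng(0))

print("prognostic weights:", np.round(w_prognostic, 4))
print("moderating weights:", np.round(w_moderating, 4))
```

Under these assumptions, covariates that are rarely or never split on receive weights close to zero, so later splitting proposals concentrate on the covariates that appear relevant. Replacing the symmetric prior `alpha / p` with larger values for selected covariates would be one way to encode prior knowledge about which covariates matter, again as an illustration rather than the paper's specification.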
