Modeling heterogeneous treatment effects in large-scale experiments using Bayesian Additive Regression Trees

We present a methodology that largely automates the search for systematic treatment effect heterogeneity in large-scale experiments. We introduce a nonparametric estimator developed in statistical learning, Bayesian Additive Regression Trees (BART), to model treatment effects that vary as a function of covariates. BART has several advantages over commonly employed parametric modeling strategies, in particular its ability to detect and flexibly model relevant treatment-covariate interactions automatically. To increase the reliability and credibility of the resulting conditional treatment effect estimates, we suggest the use of a split-sample design. The data are randomly divided into two equal-sized parts, with the first part used to explore treatment effect heterogeneity and the second part used to confirm the results. This approach permits a relatively unstructured, data-driven exploration of treatment effect heterogeneity while avoiding charges of data dredging and mitigating multiple comparison problems. We illustrate the value of our approach with two empirical examples: a survey experiment on Americans' support for social welfare spending and a voter mobilization field experiment. In both applications, BART provides robust insights into the nature of systematic treatment effect heterogeneity.
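
The split-sample design described above can be illustrated with a minimal sketch. It is not the authors' implementation: BART is replaced here by a plain subgroup difference in means so that the example needs only the standard library, and the simulated data, the covariate `group`, and the helper names (`simulate`, `split_sample`, `subgroup_effect`) are all hypothetical. The point is only the workflow: search freely in one half, then re-estimate the selected effect in the untouched half.

```python
import random
from statistics import mean


def simulate(n_units, seed=0):
    """Hypothetical experiment: the treatment shifts the outcome by 2.0
    in subgroup "a" and by 0.0 in subgroup "b" (assumed values)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_units):
        group = rng.choice(["a", "b"])
        treat = rng.randint(0, 1)  # randomized assignment
        effect = 2.0 if group == "a" else 0.0
        y = effect * treat + rng.gauss(0, 1)
        rows.append({"group": group, "treat": treat, "y": y})
    return rows


def split_sample(n_units, seed=0):
    """Randomly divide unit indices into two equal-sized halves."""
    rng = random.Random(seed)
    idx = list(range(n_units))
    rng.shuffle(idx)
    half = n_units // 2
    return idx[:half], idx[half:]


def subgroup_effect(rows, subgroup):
    """Treated-minus-control difference in mean outcomes within a subgroup.
    This simple estimator stands in for BART in this sketch."""
    treated = [r["y"] for r in rows if r["group"] == subgroup and r["treat"] == 1]
    control = [r["y"] for r in rows if r["group"] == subgroup and r["treat"] == 0]
    return mean(treated) - mean(control)


data = simulate(4000, seed=1)
explore_idx, confirm_idx = split_sample(len(data), seed=2)
explore = [data[i] for i in explore_idx]
confirm = [data[i] for i in confirm_idx]

# Exploration half: search over subgroups for the largest estimated effect.
candidates = sorted({r["group"] for r in explore})
best = max(candidates, key=lambda g: subgroup_effect(explore, g))

# Confirmation half: re-estimate only the effect selected during exploration.
confirmed = subgroup_effect(confirm, best)
print(best, round(confirmed, 2))
```

Because the confirmation half plays no role in choosing `best`, the estimate computed there is not distorted by the search, which is the rationale the abstract gives for the design.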
