Estimating Individual Treatment Effects through Causal Populations Identification

Estimating the Individual Treatment Effect (ITE) from observational data is a challenging problem in causal learning: the ITE is defined as the difference between an individual's outcomes with and without treatment or intervention, yet only one of the two outcomes is ever observed. In this paper, we formulate this problem as an inference problem with hidden variables and enforce causal constraints based on a model of four mutually exclusive causal populations. We propose a new version of the EM algorithm, called the Expected-Causality-Maximization (ECM) algorithm, and provide hints on its convergence under mild conditions. We compare our algorithm to baseline methods on synthetic and real-world data and discuss its performance.
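To make the definition concrete, the following minimal Python sketch illustrates why the ITE is hidden. It is not the ECM algorithm. It assumes, as an illustration only, that outcomes are binary, so that the four causal populations can be read as the four combinations of potential outcomes (y0 without treatment, y1 with treatment). The abstract does not spell out this reading, and the names POPULATIONS, ite and compatible_populations are hypothetical.

```python
from itertools import product

# ASSUMPTION (not stated in the abstract): outcomes are binary, so the four
# exclusive populations are taken to be the four pairs (y0, y1) of potential
# outcomes without (y0) and with (y1) treatment.
POPULATIONS = list(product((0, 1), repeat=2))  # (0,0), (0,1), (1,0), (1,1)


def ite(y0, y1):
    """Individual treatment effect: outcome with treatment minus outcome without."""
    return y1 - y0


def compatible_populations(t, y):
    """Return the populations consistent with observing outcome y under treatment t.

    Only one potential outcome is observed: y1 if t == 1, y0 if t == 0.
    """
    return [(y0, y1) for (y0, y1) in POPULATIONS if (y1 if t == 1 else y0) == y]


if __name__ == "__main__":
    for pop in POPULATIONS:
        print("population", pop, "ITE =", ite(*pop))
    # A treated individual with a positive outcome may belong to either of two
    # populations, whose ITEs differ.
    print(compatible_populations(t=1, y=1))
```

Under this assumption, every observed pair (t, y) remains compatible with exactly two of the four populations, and those two populations have different ITEs. This is one way to see why population membership has to be treated as a hidden variable.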
