Treatment effect estimation with disentangled latent factors

A common challenge in many scientific studies is to determine whether a treatment is effective for an outcome. For a binary treatment, this problem can be addressed by estimating the average treatment effect within the potential outcome framework. Moreover, different individuals often respond differently to the same treatment because of their distinct characteristics. To understand this heterogeneity, practitioners need to estimate conditional average treatment effects, conditioning on the variables that describe those characteristics. Much research has been devoted to estimating treatment effects from observational data; however, most existing methods assume that the set of observed variables contains exactly the confounders, that is, the variables that affect both the treatment and the outcome. Unfortunately, this assumption is frequently violated in real-world applications, not only because some observed variables affect only the treatment or only the outcome, but also because in many cases only proxy variables of the underlying confounding factors can be observed. In this work, we first show the importance of differentiating confounding factors from instrumental and risk factors for average and conditional average treatment effect estimation. We then propose a variational inference approach that simultaneously infers latent factors from the observed variables and disentangles them into three disjoint sets corresponding to the instrumental, confounding, and risk factors. Experimental results on synthetic, benchmark, and real-world datasets demonstrate the effectiveness of the proposed method for treatment effect estimation.
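The distinction between the three factor types can be illustrated with a toy simulation. The sketch below is not the proposed variational method: it assumes the instrumental, confounding, and risk factors are directly observed, that the outcome is linear, and that the treatment effect is a constant `tau`. All names (`simulate`, `naive_ate`, `adjusted_ate`) and coefficients are hypothetical choices made for illustration. It compares a naive difference in means with a linear regression adjustment for the confounding factor.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Draw a toy dataset with one factor of each type (all assumed observed)."""
    rng = np.random.default_rng(seed)
    z_i = rng.normal(size=n)  # instrumental factor: affects the treatment only
    z_c = rng.normal(size=n)  # confounding factor: affects treatment and outcome
    z_r = rng.normal(size=n)  # risk factor: affects the outcome only

    # Treatment assignment depends on the instrumental and confounding factors.
    propensity = 1.0 / (1.0 + np.exp(-(z_i + z_c)))
    t = rng.binomial(1, propensity)

    # Outcome depends on treatment, the confounding factor, and the risk factor.
    tau = 2.0  # true (constant) treatment effect, an assumption of this toy model
    y = tau * t + 1.5 * z_c + 1.0 * z_r + rng.normal(scale=0.5, size=n)
    return t, y, z_i, z_c, z_r, tau


def naive_ate(t, y):
    """Difference in mean outcomes between treated and control units."""
    return y[t == 1].mean() - y[t == 0].mean()


def adjusted_ate(t, y, covariates):
    """Coefficient on t from an ordinary least squares fit of y on [1, t, covariates]."""
    X = np.column_stack([np.ones_like(y), t, *covariates])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


if __name__ == "__main__":
    t, y, z_i, z_c, z_r, tau = simulate()
    print("true effect:             ", tau)
    print("naive difference:        ", naive_ate(t, y))
    print("adjusted for confounder: ", adjusted_ate(t, y, [z_c]))
    print("adjusted for conf + risk:", adjusted_ate(t, y, [z_c, z_r]))
```

In this toy setting the naive estimate is biased because the confounding factor drives both treatment and outcome, while adjusting for it recovers an estimate close to `tau`. Adding the risk factor to the adjustment set is not needed to remove bias here, but it explains outcome variance and so tends to tighten the estimate.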
