Learning Decomposed Representation for Counterfactual Inference

One fundamental problem in learning treatment effects from observational data is confounder identification and balancing. Most previous methods achieve confounder balancing by treating all observed variables as confounders, without distinguishing confounders from non-confounders. In general, not all observed variables are confounders, that is, common causes of both the treatment and the outcome: some variables contribute only to the treatment, and some contribute only to the outcome. Balancing those non-confounders introduces additional bias into treatment effect estimation. By modeling the different relations among variables, treatment, and outcome, we propose a synergistic learning framework to 1) identify and balance confounders by learning decomposed representations of confounders and non-confounders, and simultaneously 2) estimate the treatment effect in observational studies via counterfactual inference. Our empirical results demonstrate that the proposed method can precisely identify and balance confounders, and that its treatment effect estimates outperform state-of-the-art methods on both synthetic and real-world datasets.
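The three variable roles described above can be illustrated with a small synthetic simulation. This is a minimal sketch under stated assumptions, not the proposed framework: the variable names (`inst`, `conf`, `adj`), the coefficients, the linear outcome model, and the ordinary least-squares adjustment are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Generate toy data with one variable per role (all values are assumptions)."""
    rng = np.random.default_rng(seed)
    inst = rng.normal(size=n)  # contributes only to the treatment
    conf = rng.normal(size=n)  # confounder: common cause of treatment and outcome
    adj = rng.normal(size=n)   # contributes only to the outcome
    # Treatment assignment depends on inst and conf, not on adj.
    p = 1.0 / (1.0 + np.exp(-(1.5 * inst + 1.5 * conf)))
    t = rng.binomial(1, p)
    true_effect = 2.0
    # Outcome depends on the treatment, conf, and adj, not on inst.
    y = true_effect * t + 3.0 * conf + 1.0 * adj + rng.normal(size=n)
    return inst, conf, adj, t, y, true_effect


def naive_ate(t, y):
    """Difference in mean outcomes between treated and control units."""
    return y[t == 1].mean() - y[t == 0].mean()


def adjusted_ate(t, y, covariates):
    """Treatment coefficient from a least-squares fit of y on t and covariates."""
    X = np.column_stack([np.ones_like(y), t, *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


if __name__ == "__main__":
    inst, conf, adj, t, y, tau = simulate()
    print("true effect:        ", tau)
    print("naive estimate:     ", naive_ate(t, y))
    print("adjusted for conf:  ", adjusted_ate(t, y, [conf]))
```

In this toy setting the naive difference in means is biased because `conf` drives both the treatment and the outcome, while adjusting for `conf` alone recovers an estimate close to the true effect. The roles are known here by construction; in observational data they are not, which is the identification problem the abstract addresses.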
