UvA-DARE (Digital Academic Repository)

Causal Effect Inference with Deep Latent-Variable Models

Learning individual-level causal effects from observational data, such as inferring the most effective medication for a specific patient, is a problem of growing importance for policy makers. The most important aspect of inferring causal effects from observational data is the handling of confounders, factors that affect both an intervention and its outcome. A carefully designed observational study attempts to measure all important confounders. However, even if one does not have direct access to all confounders, there may exist noisy and uncertain measurements of proxies for confounders. We build on recent advances in latent variable modeling to simultaneously estimate the unknown latent space summarizing the confounders and the causal effect. Our method is based on Variational Autoencoders (VAEs) that follow the causal structure of inference with proxies. We show that our method is significantly more robust than existing methods and matches the state of the art on previous benchmarks focused on individual treatment effects.
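To make the role of confounders and noisy proxies concrete, the following is a minimal simulation sketch. It is not the VAE-based method described above. All names and numbers are assumptions chosen for illustration: a binary hidden confounder `z`, a proxy `x` that equals `z` 80% of the time, and a true treatment effect of 1.0. Under these assumed parameters, the population value of the naive difference in means is 2.2, stratifying on the proxy gives roughly 1.88, and stratifying on the (normally unobserved) confounder gives 1.0.

```python
import random


def simulate(n=20000, seed=0):
    """Draw (z, x, t, y) rows from a toy confounded model (all parameters assumed)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0                   # hidden confounder
        x = z if rng.random() < 0.8 else 1 - z               # noisy proxy of z
        t = 1 if rng.random() < (0.8 if z else 0.2) else 0   # treatment depends on z
        y = 1.0 * t + 2.0 * z + rng.gauss(0.0, 1.0)          # true effect of t is 1.0
        rows.append((z, x, t, y))
    return rows


def mean(values):
    return sum(values) / len(values)


def diff_in_means(rows):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [y for (_, _, t, y) in rows if t == 1]
    control = [y for (_, _, t, y) in rows if t == 0]
    return mean(treated) - mean(control)


def stratified_effect(rows, index):
    """Average the within-stratum differences, weighted by stratum size.

    index=0 stratifies on the confounder z, index=1 on the proxy x.
    """
    total = 0.0
    for level in (0, 1):
        stratum = [r for r in rows if r[index] == level]
        total += len(stratum) / len(rows) * diff_in_means(stratum)
    return total


rows = simulate()
naive = diff_in_means(rows)                       # ignores confounding entirely
proxy_adjusted = stratified_effect(rows, index=1)  # adjusts for the noisy proxy only
oracle = stratified_effect(rows, index=0)          # adjusts for the true confounder

print(f"naive={naive:.2f} proxy_adjusted={proxy_adjusted:.2f} oracle={oracle:.2f}")
```

In this toy setting, treating the proxy as if it were the confounder removes only part of the bias, which is the gap that modeling the confounder as a latent variable is meant to address.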
