Sequential Deconfounding for Causal Inference with Unobserved Confounders

Using observational data to estimate the effect of a treatment is a powerful tool for decision-making when randomized experiments are infeasible or costly. However, observational data often yield biased estimates of treatment effects, since treatment assignment can be confounded by unobserved variables. Deconfounding methods offer a remedy by adjusting for such unobserved confounders. In this paper, we develop the Sequential Deconfounder, a method for estimating individualized treatment effects over time in the presence of unobserved confounders. This is the first deconfounding method that can be used in a general sequential setting (i.e., with one or more treatments assigned at each timestep). The Sequential Deconfounder uses a novel Gaussian process latent variable model to infer substitutes for the unobserved confounders, which are then used in conjunction with an outcome model to estimate treatment effects over time. We prove that our method yields unbiased estimates of individualized treatment responses over time. Using simulated and real medical data, we demonstrate the efficacy of our method in deconfounding the estimation of treatment responses over time.
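
The two-stage idea described above (infer a substitute for the unobserved confounder from the treatment assignments, then adjust for it in an outcome model) can be illustrated with a toy simulation. The sketch below is not the Sequential Deconfounder itself: it replaces the Gaussian process latent variable model with a much cruder stand-in (the per-patient mean of treatments over time), assumes a single time-invariant confounder and linear outcomes, and uses hypothetical names (`simulate`, `ols_slope`, `compare_estimates`) and parameter values chosen purely for illustration.

```python
import numpy as np


def simulate(n_patients=2000, n_steps=10, beta=1.0, gamma=2.0, seed=0):
    """Toy data: one hidden, time-invariant confounder per patient (an assumption)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n_patients, 1))             # unobserved confounder
    a = z + rng.normal(size=(n_patients, n_steps))   # treatment at each timestep depends on z
    y = beta * a + gamma * z + rng.normal(size=(n_patients, n_steps))  # outcome depends on both
    return a, y


def ols_slope(features, target):
    """Least-squares fit with an intercept; returns the coefficient of the first feature."""
    X = np.column_stack([np.ones(len(target))] + features)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef[1]


def compare_estimates(a, y):
    """Return (naive, adjusted) estimates of the treatment effect beta."""
    # Stage 1 (stand-in for the latent variable model): use each patient's
    # mean treatment over time as a substitute for the hidden confounder.
    z_hat = np.repeat(a.mean(axis=1, keepdims=True), a.shape[1], axis=1)

    # Stage 2 (outcome model): pooled linear regression over all (patient, timestep) pairs.
    naive = ols_slope([a.ravel()], y.ravel())                     # ignores confounding
    adjusted = ols_slope([a.ravel(), z_hat.ravel()], y.ravel())   # adjusts for the substitute
    return naive, adjusted


if __name__ == "__main__":
    a, y = simulate()
    naive, adjusted = compare_estimates(a, y)
    print(f"true effect: 1.0, naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

In this toy setup the naive regression overstates the effect because the hidden confounder drives both treatment and outcome, while adjusting for the substitute brings the estimate close to the true value. The substitute works here only because the same confounder influences every timestep; a nonlinear or time-varying setting would need a richer latent variable model such as the one the paper proposes.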
