Bayesian federated estimation of causal effects from observational data

We propose a Bayesian framework for estimating causal effects from federated observational data sources. Bayesian causal inference learns the distribution of the causal estimands rather than a point estimate, and so quantifies the uncertainty of causal effects. Our framework estimates the posterior distributions of the causal effects and uses them to compute the higher-order statistics that capture this uncertainty. It integrates local causal effects from different data sources without centralizing the data. We estimate the treatment effects from observational data using a non-parametric reformulation of the classical potential outcomes framework. We model the potential outcomes as random functions drawn from Gaussian processes, whose defining parameters can be learned efficiently from multiple data sources. Our method avoids exchanging raw data among the sources, thus contributing towards privacy-preserving causal learning. We demonstrate the promise of our approach on a set of simulated and real-world examples.
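
To make the idea of combining sources without pooling raw data concrete, the following is a minimal sketch and not the method of this paper, whose details the abstract does not give. It assumes a random Fourier feature approximation to a Gaussian process with an RBF kernel, a known noise variance, a Gaussian prior on the weights, and feature parameters (`freqs`, `phases`) that all sources share in advance. The function names `fourier_features`, `local_statistics`, and `aggregate_ate` are hypothetical. Each source sends only summed statistics, and the aggregator returns the posterior mean and variance of the average treatment effect.

```python
import numpy as np

def fourier_features(x, freqs, phases):
    """Random Fourier features (assumed GP approximation) plus an intercept."""
    feats = np.sqrt(2.0 / freqs.shape[1]) * np.cos(x @ freqs + phases)
    return np.hstack([np.ones((x.shape[0], 1)), feats])

def local_statistics(x, t, y, freqs, phases):
    """Summaries a source shares in place of its raw rows."""
    phi = fourier_features(x, freqs, phases)
    # One weight block per potential outcome: control first, then treated.
    design = np.hstack([phi * (1 - t)[:, None], phi * t[:, None]])
    return design.T @ design, design.T @ y, phi.sum(axis=0), len(y)

def aggregate_ate(stats, noise_var=1.0, prior_var=1.0):
    """Combine summaries into a Gaussian posterior over the ATE."""
    gram = sum(s[0] for s in stats)
    moment = sum(s[1] for s in stats)
    phi_mean = sum(s[2] for s in stats) / sum(s[3] for s in stats)
    # Conjugate Bayesian linear regression posterior over the weights.
    precision = gram / noise_var + np.eye(gram.shape[0]) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ moment / noise_var
    # ATE = average over covariates of (treated surface - control surface).
    contrast = np.concatenate([-phi_mean, phi_mean])
    return contrast @ mean, contrast @ cov @ contrast

# Hypothetical simulated data: three sources, true effect of 2.0.
rng = np.random.default_rng(0)
freqs = rng.normal(size=(2, 50))
phases = rng.uniform(0, 2 * np.pi, size=50)

def simulate_source(n):
    x = rng.normal(size=(n, 2))
    # Treatment probability depends on x, so the data are observational.
    t = (rng.uniform(size=n) < 1 / (1 + np.exp(-x[:, 0]))).astype(float)
    y = np.sin(x[:, 0]) + 2.0 * t + 0.1 * rng.normal(size=n)
    return x, t, y

sources = [simulate_source(200) for _ in range(3)]
stats = [local_statistics(x, t, y, freqs, phases) for x, t, y in sources]
ate_mean, ate_var = aggregate_ate(stats)
print(f"ATE posterior mean {ate_mean:.3f}, std {np.sqrt(ate_var):.3f}")
```

Because the shared statistics are sums over rows, the aggregated posterior in this sketch matches what a single analyst would obtain from the pooled data. The summaries are not a formal privacy guarantee on their own.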
