Weighted Gaussian Process for Estimating Treatment Effect

Estimating treatment effects is crucial in many fields, including medicine, psychology, and economics. Accurate estimation is difficult in most observational studies because the collected samples are inevitably biased: the covariate distributions of the treatment and control groups are misaligned due to experimental conditions or constraints. To address this issue, we borrow covariate shift correction techniques from the transfer learning community and incorporate them into a weighted Gaussian process for effective bias correction. Our method can (1) correct sample bias, (2) predict both population and individual treatment effects, and (3) provide corresponding confidence intervals.
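To make the idea concrete, the following is a minimal sketch of one way importance weights could enter a Gaussian process regression. It is not taken from the paper: the weighting scheme (dividing each sample's noise variance by its weight), the propensity-odds weight estimator, and all function names and hyperparameters are illustrative assumptions.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def propensity_ratio_weights(X, t, n_iter=500, lr=0.1):
    """Hypothetical weight estimator: p(x | treated) / p(x | control).

    Fits a logistic propensity model by plain gradient descent and converts
    the fitted odds into a density ratio via Bayes' rule.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        beta -= lr * Xb.T @ (p - t) / len(t)
    p = np.clip(1.0 / (1.0 + np.exp(-Xb @ beta)), 1e-3, 1.0 - 1e-3)
    prior_odds = (1.0 - t.mean()) / t.mean()  # P(control) / P(treated)
    return (p / (1.0 - p)) * prior_odds


def weighted_gp_predict(X, y, weights, X_star, noise_var=0.1, length_scale=1.0):
    """GP posterior mean and variance with per-sample noise noise_var / w_i.

    Assumption: a sample with a larger importance weight is treated as a
    less noisy observation, so it pulls the posterior more strongly.
    """
    noise = noise_var / np.asarray(weights, dtype=float)
    K = rbf_kernel(X, X, length_scale)
    # A small jitter keeps the Cholesky factorization numerically stable.
    L = np.linalg.cholesky(K + np.diag(noise) + 1e-8 * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    K_s = rbf_kernel(X, X_star, length_scale)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    prior_var = np.diag(rbf_kernel(X_star, X_star, length_scale))
    var = prior_var - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)
```

Under these assumptions, one could fit a weighted model to each group and take the difference of the two posterior means at a covariate value as an individual effect estimate, with the posterior variances supplying an interval. With all weights equal to one, the sketch reduces to standard Gaussian process regression.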
