Paper Information - Limits of Estimating Heterogeneous Treatment Effects: Guidelines for Practical Algorithm Design

Limits of Estimating Heterogeneous Treatment Effects: Guidelines for Practical Algorithm Design

Estimating heterogeneous treatment effects from observational data is a central problem in many domains. Because counterfactual data is inaccessible, the problem differs fundamentally from supervised learning, and entails a more complex set of modeling choices. Despite a variety of recently proposed algorithmic solutions, a principled guideline for building estimators of treatment effects using machine learning algorithms is still lacking. In this paper, we provide such a guideline by characterizing the fundamental limits of estimating heterogeneous treatment effects, and establishing conditions under which these limits can be achieved. Our analysis reveals that the relative importance of the different aspects of observational data varies with the sample size. For instance, we show that selection bias matters only in small-sample regimes, whereas with a large sample size, the way an algorithm models the control and treated outcomes is what bottlenecks its performance. Guided by our analysis, we build a practical algorithm for estimating treatment effects using a non-stationary Gaussian process with doubly-robust hyperparameters. Using a standard semi-synthetic simulation setup, we show that our algorithm outperforms the state-of-the-art, and that the behavior of existing algorithms conforms with our analysis.
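To make the problem setup in the abstract concrete, the following is a minimal sketch of treatment-effect estimation from observational data. It is not the paper's Gaussian-process algorithm: it swaps in the simplest two-model approach (often called a T-learner), fitting one least-squares line to the control outcomes and one to the treated outcomes, then taking their difference. The data-generating process, the propensity function, and every name in the code (`simulate`, `fit_line`, `t_learner`, `pehe`, `true_cate`) are assumptions made for illustration only. The error is measured with a squared-error criterion often called PEHE, which can be computed here only because the synthetic ground truth is known.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def true_cate(x):
    """Ground-truth effect of the hypothetical simulation: E[Y(1) - Y(0) | X = x]."""
    return 1.0 + 1.0 * x


def simulate(n, seed=0):
    """Hypothetical observational data with selection bias.

    Units with larger x are more likely to be treated, and only the
    factual outcome (the one matching the assignment) is recorded.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        propensity = 1.0 / (1.0 + math.exp(-2.0 * x))
        w = 1 if rng.random() < propensity else 0
        y0 = 1.0 + 0.5 * x + rng.gauss(0.0, 0.1)  # control outcome
        y1 = 2.0 + 1.5 * x + rng.gauss(0.0, 0.1)  # treated outcome
        data.append((x, w, y1 if w else y0))
    return data


def t_learner(data):
    """Fit separate outcome models per arm and return their difference."""
    control = [(x, y) for x, w, y in data if w == 0]
    treated = [(x, y) for x, w, y in data if w == 1]
    a0, b0 = fit_line(*zip(*control))
    a1, b1 = fit_line(*zip(*treated))
    return lambda x: (a1 + b1 * x) - (a0 + b0 * x)


def pehe(cate_hat, xs):
    """Mean squared error between estimated and true effects on the points xs."""
    return sum((cate_hat(x) - true_cate(x)) ** 2 for x in xs) / len(xs)


data = simulate(2000)
cate_hat = t_learner(data)
grid = [i / 50.0 - 1.0 for i in range(101)]  # evaluation points in [-1, 1]
error = pehe(cate_hat, grid)
print(f"PEHE on grid: {error:.5f}")
```

Because both outcome models are correctly specified in this toy setup, the estimate stays accurate despite the biased treatment assignment. Rerunning `simulate` with a much smaller `n` is one way to see the effect of sample size on the error in this toy setting.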

Mihaela van der Schaar | Ahmed M. Alaa
