Paper information - Active Learning for Decision-Making from Imbalanced Observational Data

Machine learning can support personalized decision-making by learning models that predict individual treatment effects (ITE). This work studies the reliability of prediction-based decision-making in the task of deciding which action $a$ to take for a target unit after observing its covariates $\tilde{x}$ and predicted outcomes $\hat{p}(\tilde{y} \mid \tilde{x}, a)$. An example is personalized medicine, where the decision is which treatment to give to a patient. A common problem when learning these models from observational data is imbalance, that is, a difference between the covariate distributions of the treated and control groups, which is known to increase the upper bound of the expected ITE estimation error. We propose to assess decision-making reliability by estimating the ITE model's Type S error rate, which is the probability that the model infers the wrong sign of the treatment effect. Furthermore, we use the estimated reliability as a criterion for active learning, in order to collect new (possibly expensive) observations instead of making a forced choice based on unreliable predictions. We demonstrate the effectiveness of this decision-making-aware active learning in two decision-making tasks: one on simulated data with binary outcomes, and one on a medical dataset with synthetic, continuous treatment outcomes.
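As a rough illustration of the Type S error idea, the sketch below estimates the probability of a sign error for one target unit from posterior samples of its treatment effect. It assumes such samples are available from some Bayesian ITE model and that the decision follows the sign of their mean. The function name `estimated_type_s_error` and the simulated Gaussian samples are hypothetical, and this is not necessarily the estimator used in the paper.

```python
import numpy as np


def estimated_type_s_error(tau_samples):
    """Fraction of posterior ITE samples whose sign disagrees with the decision.

    Assumption: the decision is the sign of the posterior mean effect.
    """
    tau_samples = np.asarray(tau_samples, dtype=float)
    decision_sign = np.sign(tau_samples.mean())
    if decision_sign == 0:
        # No preferred action; treat the decision as a coin flip (assumption).
        return 0.5
    return float(np.mean(np.sign(tau_samples) != decision_sign))


# Hypothetical posterior samples for two target units.
rng = np.random.default_rng(0)
confident_unit = rng.normal(loc=2.0, scale=0.5, size=5000)  # effect clearly positive
uncertain_unit = rng.normal(loc=0.1, scale=1.0, size=5000)  # sign is ambiguous

error_confident = estimated_type_s_error(confident_unit)
error_uncertain = estimated_type_s_error(uncertain_unit)
print(error_confident, error_uncertain)
```

Under these assumptions, a unit with a high estimated error is one where a forced decision would be unreliable, so it is a natural candidate for collecting further observations before deciding.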
