A general framework for causal classification

Many applications need to predict, from data, the effect of an intervention on different individuals. For example, which customers can be persuaded by a product promotion? Which patients should be treated? These are causal questions: they concern the change in an outcome brought about by an intervention. Traditional classification methods cannot answer them, because they only predict static outcomes. In personalised marketing, such questions are often answered with uplift modelling. The objective of uplift modelling is to estimate a causal effect, but the uplift modelling literature does not discuss when the uplift represents a causal effect. Causal heterogeneity modelling can solve the problem, but its unconfoundedness assumption cannot be tested from data, so practitioners need guidelines for applying these methods. In this paper, we use the term causal classification for a set of personalised decision-making problems and differentiate it from classification. We discuss the conditions under which causal classification can be resolved by uplift (and causal heterogeneity) modelling methods. We also propose a general framework for causal classification that uses off-the-shelf supervised methods for flexible implementation. Experiments show that two instantiations of the framework work for causal classification and for uplift (causal heterogeneity) modelling, and are competitive with other uplift (causal heterogeneity) modelling methods.
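To make the distinction between predicting an outcome and predicting the effect of an intervention concrete, here is a minimal sketch in Python. It is not the framework proposed in the paper. It estimates uplift within each stratum as the difference between treated and control outcome rates, then labels a stratum as worth treating when that difference exceeds a threshold. The function names, the toy records, and the threshold are hypothetical. The difference can be read as a causal effect only under the assumption that treatment is randomised, or unconfounded given the stratum.

```python
from collections import defaultdict


def estimate_uplift(records):
    """Estimate uplift per stratum from (stratum, treated, outcome) records.

    treated and outcome are 0 or 1. The result maps each stratum to
    P(Y=1 | T=1, stratum) - P(Y=1 | T=0, stratum).
    Assumption: treatment is randomised or unconfounded within a stratum.
    """
    # counts[stratum][t] = [number of records, number of positive outcomes]
    counts = defaultdict(lambda: [[0, 0], [0, 0]])
    for stratum, treated, outcome in records:
        cell = counts[stratum][treated]
        cell[0] += 1
        cell[1] += outcome

    uplift = {}
    for stratum, (control, treat) in counts.items():
        if control[0] == 0 or treat[0] == 0:
            continue  # no overlap between groups: uplift is not estimable
        uplift[stratum] = treat[1] / treat[0] - control[1] / control[0]
    return uplift


def causal_classify(uplift, threshold=0.0):
    """Label a stratum True when its estimated uplift exceeds the threshold."""
    return {stratum: value > threshold for stratum, value in uplift.items()}


# Hypothetical toy data: (stratum, treated, outcome)
records = (
    [("young", 1, 1)] * 3 + [("young", 1, 0)] * 1
    + [("young", 0, 1)] * 1 + [("young", 0, 0)] * 3
    + [("old", 1, 1)] * 2 + [("old", 1, 0)] * 2
    + [("old", 0, 1)] * 2 + [("old", 0, 0)] * 2
)

uplift = estimate_uplift(records)
decisions = causal_classify(uplift)
print(uplift)     # {'young': 0.5, 'old': 0.0}
print(decisions)  # {'young': True, 'old': False}
```

In this toy data, treated individuals in both strata respond at a rate of at least 50%, so an outcome classifier trained on treated records could flag both as likely responders. Only the "young" stratum shows a change attributable to the treatment.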
