Decision trees for uplift modeling with single and multiple treatments

Most classification approaches aim at achieving high prediction accuracy on a given dataset. However, in most practical cases, some action, such as mailing an offer or treating a patient, is to be taken on the classified objects, and we should model not the class probabilities themselves but the change in class probabilities caused by the action. The action should then be performed on those objects for which it will be most profitable. This problem is known as uplift modeling, differential response analysis, or true lift modeling, but it has received very little attention in the machine learning literature. An important extension of the problem involves several possible actions, where for each object the model must also decide which action to use in order to maximize profit. In this paper, we present tree-based classifiers designed for uplift modeling in both the single and multiple treatment cases. To this end, we design new splitting criteria and pruning methods. The experiments confirm the usefulness of the proposed approaches and show significant improvement over previous uplift modeling techniques.
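
To make the quantity being modeled concrete, the following is a minimal sketch of uplift as a difference in response rates between an action group and a control group, with a choice among several actions. It is an illustration only and not the splitting criteria or pruning methods proposed in the paper; the function names, the segment and action labels, and the toy records are all assumptions introduced here.

```python
from collections import defaultdict

def response_rates(records):
    """Estimate response rates from (segment, action, outcome) records.

    outcome is 0 or 1. Returns {segment: {action: response rate}}.
    """
    # counts[segment][action] = [number of positive outcomes, number of records]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for segment, action, outcome in records:
        cell = counts[segment][action]
        cell[0] += outcome
        cell[1] += 1
    return {
        seg: {act: pos / n for act, (pos, n) in acts.items()}
        for seg, acts in counts.items()
    }

def best_action(rates, control="control"):
    """Return (action, uplift) for the action with the largest positive uplift.

    Uplift is the action's response rate minus the control response rate.
    If no action beats control, return (control, 0.0), i.e. take no action.
    """
    base = rates[control]
    best, best_uplift = control, 0.0
    for action, rate in rates.items():
        if action == control:
            continue
        uplift = rate - base
        if uplift > best_uplift:
            best, best_uplift = action, uplift
    return best, best_uplift

# Hypothetical toy data: two segments, a control group, and two actions.
records = (
    [("young", "control", y) for y in (1, 0, 0, 0)]
    + [("young", "mail", y) for y in (1, 1, 0, 0)]
    + [("young", "call", y) for y in (1, 1, 1, 0)]
    + [("old", "control", y) for y in (1, 1, 0, 0)]
    + [("old", "mail", y) for y in (1, 0, 0, 0)]
    + [("old", "call", y) for y in (1, 1, 0, 0)]
)

rates = response_rates(records)
for segment in ("young", "old"):
    print(segment, best_action(rates[segment]))
```

In this toy data, a conventional classifier would rank the "old" segment highly under the "call" action because half of its members respond, yet the same fraction responds without any action, so the uplift is zero. An uplift tree aims to find such segments automatically by splitting on attributes rather than using predefined groups.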
