Optimising Individual-Treatment-Effect Using Bandits

Applying causal inference models in areas such as economics, healthcare and marketing has attracted great interest from the machine learning community. In particular, estimating the individual treatment effect (ITE) in settings such as precision medicine and targeted advertising has seen a surge in applications. Optimising the ITE under the strong ignorability assumption (meaning that all confounders influencing the outcome of a treatment are recorded in the data) is often referred to as uplift modeling (UM). While these techniques have proven useful in many settings, their performance degrades markedly in dynamic environments because of concept drift. Consider, for example, the negative effect on a marketing campaign when a competitor's product is released. To counter this, we propose the uplifted contextual multi-armed bandit (U-CMAB), a novel approach that optimises the ITE by drawing upon the bandit literature. Experiments on real and simulated data indicate that our proposed approach compares favourably against the state-of-the-art. All our code is available online.
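To make the notion of uplift concrete, the following is a minimal sketch that assumes randomised treatment assignment (so that strong ignorability holds by design) and a single discrete context feature. It estimates the treatment effect per segment as the difference in mean outcome between treated and control units. This is a plain batch estimate for illustration only, not the U-CMAB itself; the function name `estimate_uplift`, the segment labels and the toy data are all hypothetical.

```python
from collections import defaultdict


def estimate_uplift(records):
    """Estimate per-segment uplift from (segment, treated, outcome) records.

    Uplift is the mean outcome of treated units minus the mean outcome of
    control units within the same segment. Assumes randomised assignment.
    """
    # Per segment: [sum of treated outcomes, treated count,
    #               sum of control outcomes, control count]
    stats = defaultdict(lambda: [0.0, 0, 0.0, 0])
    for segment, treated, outcome in records:
        s = stats[segment]
        if treated:
            s[0] += outcome
            s[1] += 1
        else:
            s[2] += outcome
            s[3] += 1

    uplift = {}
    for segment, (sum_t, n_t, sum_c, n_c) in stats.items():
        # Only report segments observed under both treatment and control.
        if n_t > 0 and n_c > 0:
            uplift[segment] = sum_t / n_t - sum_c / n_c
    return uplift


# Hypothetical toy data: (segment, treated?, outcome)
records = [
    ("young", True, 1), ("young", True, 1), ("young", True, 0), ("young", True, 1),
    ("young", False, 0), ("young", False, 1), ("young", False, 0), ("young", False, 0),
    ("senior", True, 0), ("senior", True, 0), ("senior", True, 1), ("senior", True, 0),
    ("senior", False, 1), ("senior", False, 0), ("senior", False, 1), ("senior", False, 0),
]

uplift = estimate_uplift(records)
# Treat only the segments whose estimated uplift is positive.
to_treat = sorted(seg for seg, u in uplift.items() if u > 0)
print(uplift, to_treat)
```

A static estimate of this kind is fitted once on historical data, which is why it cannot react when the underlying response changes over time; that is the concept-drift problem the abstract describes.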
