Feature Selection Methods for Uplift Modeling

Uplift modeling is a predictive modeling technique that uses machine learning models to estimate the user-level incremental effect of a treatment. It is often used to target promotions and advertisements and to personalize product offerings. In these applications, hundreds of features are often available for building such models, and keeping all of them in a model can be costly and inefficient. Feature selection is therefore an essential step in the modeling process, for several reasons: it improves estimation accuracy by eliminating irrelevant features, speeds up model training and prediction, reduces the monitoring and maintenance workload for feature data pipelines, and makes models easier to interpret and diagnose. However, feature selection methods for uplift modeling have rarely been discussed in the literature. Although various feature selection methods exist for standard machine learning models, we demonstrate that those methods are sub-optimal for uplift modeling. To address this problem, we introduce a set of feature selection methods designed specifically for uplift modeling, including both filter methods and embedded methods. To evaluate the proposed methods, we train different uplift models and measure the accuracy of each model with varying numbers of selected features, using both synthetic and real data. We have also implemented the proposed filter methods in an open source Python package (CausalML).
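To make the contrast with standard feature selection concrete, the following is a minimal sketch of a filter-style score for uplift, set against a conventional outcome-correlation filter. It is an illustration under stated assumptions, not the method proposed in the paper or the CausalML API: the function names (uplift_bin_score, outcome_corr_score), the quantile binning, the scoring rule, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def uplift_bin_score(x, treatment, y, n_bins=5):
    """Hypothetical filter score: how much the per-bin uplift varies across
    quantile bins of x (weighted squared deviation from the overall uplift)."""
    # Interior quantile edges split x into n_bins roughly equal-sized bins.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(x, edges)
    overall = y[treatment == 1].mean() - y[treatment == 0].mean()
    score = 0.0
    for b in np.unique(bins):
        in_bin = bins == b
        treated = in_bin & (treatment == 1)
        control = in_bin & (treatment == 0)
        if treated.sum() == 0 or control.sum() == 0:
            continue  # uplift is undefined without both groups in the bin
        bin_uplift = y[treated].mean() - y[control].mean()
        score += in_bin.mean() * (bin_uplift - overall) ** 2
    return score


def outcome_corr_score(x, y):
    """Conventional filter: absolute correlation between feature and outcome."""
    return abs(np.corrcoef(x, y)[0, 1])


# Synthetic data (an assumption for illustration): one feature shifts the
# baseline outcome, one changes the treatment effect, one is pure noise.
rng = np.random.default_rng(0)
n = 20000
features = {
    "x_prog": rng.normal(size=n),   # affects the outcome, not the uplift
    "x_mod": rng.normal(size=n),    # affects the uplift only
    "x_noise": rng.normal(size=n),  # irrelevant
}
treatment = rng.integers(0, 2, size=n)  # randomized assignment
y = (
    2.0 * features["x_prog"]
    + treatment * (features["x_mod"] > 0)
    + rng.normal(size=n)
)

uplift_scores = {k: uplift_bin_score(v, treatment, y) for k, v in features.items()}
corr_scores = {k: outcome_corr_score(v, y) for k, v in features.items()}

print("top feature by uplift score:", max(uplift_scores, key=uplift_scores.get))
print("top feature by outcome correlation:", max(corr_scores, key=corr_scores.get))
```

In this toy setup the correlation filter favors the feature that predicts the outcome itself, while the uplift-oriented score favors the feature that drives the difference between treated and control users, which is the quantity an uplift model needs to capture.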
