Using machine learning techniques to predict defection of top clients

Fierce competition in many industries leads to customer switching behavior. Because the foregone profits of defected customers are significant, an increase in the retention rate can be very profitable. In this paper, we focus on the treatment of a company's most promising current customers in a non-contractual setting. We build a model to predict churn behavior, identifying top clients who will (partially) defect in the near future. We applied the following classification techniques: logistic regression, linear discriminant analysis, quadratic discriminant analysis, C4.5, neural networks and Naive Bayes. Their performance is quantified by the classification accuracy and the area under the receiver operating characteristic curve (AUROC). The experiments were carried out on a real-life data set obtained from a Belgian retailer. The article makes several contributions. The results show that past customer behavior has predictive power to indicate future partial defection. From a company's point of view, this finding is even more important than being able to identify total defectors, which was until now the traditional goal in attrition research. Neural networks performed better than the other classification techniques in terms of both classification accuracy and AUROC. Although the performance benefits are sometimes small in absolute terms, they are statistically significant and relevant from a marketing perspective. Finally, the number of past shop visits and the time between past shop incidences were found to be amongst the most predictive inputs for the problem at hand.
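The two performance measures named above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 decision threshold and the toy labels and scores are all assumptions made for illustration. The AUROC is computed here through its rank interpretation, as the probability that a randomly chosen defector receives a higher score than a randomly chosen non-defector, with ties counted as one half.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of customers classified correctly at a given cut-off.

    The 0.5 default threshold is an illustrative assumption.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the share of (defector, non-defector) pairs ranked correctly.

    Ties between a defector and a non-defector count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = (partial) defector, 0 = loyal customer.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.35, 0.1, 0.8, 0.6]

print("accuracy:", accuracy(toy_labels, toy_scores))
print("AUROC:", auroc(toy_labels, toy_scores))
```

Unlike classification accuracy, the AUROC does not depend on a single cut-off, which is one reason the two measures are commonly reported side by side.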
