Mining social networks for customer churn prediction

Customer churn prediction models aim to detect customers with a high propensity to attrite. This study investigates the applicability of relational learning techniques to predict customer churn using social network information. A range of existing, extended, and novel relational classifiers and collective inference procedures have been (re-)implemented and applied to two large-scale real-life data sets obtained from two international telco operators, containing both networked (call detail record data) and non-networked (usage statistics, sociodemographic, marketing-related) information about millions of customers. The results of the experiments indicate a limited but relevant impact of network effects on customer churn behavior. Incorporating higher-order network effects improves the predictive power of a customer churn prediction model. Collective inference procedures, however, deteriorate classification performance.

1. Social network mining for customer churn prediction

Huge amounts of networked data on a broad range of network processes and information flows between interlinked entities have become available, for instance call logs linking telephone accounts (Dasgupta et al., 2008), money transfers connecting bank accounts, or hyperlinks relating web pages (Neville and Jensen, 2007). These massive data logs potentially hide information that is extremely valuable to companies and organizations, but that is also extremely difficult to discover due to the size and fragmentation of the data. Networked data present both complications and opportunities for predictive data mining. The data are patently not independent and identically distributed, which introduces bias to learning and inference procedures (Jensen and Neville, 2002; Macskassy and Provost, 2007).
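As a minimal sketch of why networked data cannot be treated as independent observations, the following Python snippet aggregates a handful of call records into a weighted call graph and compares the share of churners among the contacts of churners with the share among the contacts of non-churners. The records, labels, and function names are hypothetical and chosen purely for illustration; they are not the data or the procedure used in this study.

```python
from collections import defaultdict

# Hypothetical CDR records: (caller, callee, call duration in seconds).
cdr = [
    ("a", "b", 120),
    ("b", "a", 60),
    ("a", "c", 30),
    ("c", "d", 300),
    ("d", "e", 45),
    ("e", "f", 200),
]

# Hypothetical churn labels (True = churned).
churned = {"a": True, "b": True, "c": False, "d": False, "e": False, "f": False}


def build_call_graph(records):
    """Aggregate directed call records into an undirected weighted graph."""
    graph = defaultdict(lambda: defaultdict(float))
    for caller, callee, seconds in records:
        graph[caller][callee] += seconds
        graph[callee][caller] += seconds
    return graph


def neighbor_churn_rate(graph, labels, node_is_churner):
    """Weighted share of churners among the contacts of one customer group."""
    churn_weight = total_weight = 0.0
    for node, neighbors in graph.items():
        if labels[node] != node_is_churner:
            continue
        for other, weight in neighbors.items():
            total_weight += weight
            if labels[other]:
                churn_weight += weight
    return churn_weight / total_weight if total_weight else 0.0


graph = build_call_graph(cdr)
rate_around_churners = neighbor_churn_rate(graph, churned, True)
rate_around_stayers = neighbor_churn_rate(graph, churned, False)
print(rate_around_churners, rate_around_stayers)
```

If churn labels were independent of the network structure, the two rates would be expected to be close to each other; a clear gap between them is the kind of relational autocorrelation that a network classifier tries to exploit.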
Relational learning aims to exploit the information contained within the network structure of data instances, and to incorporate this information within a network classification or regression model (Džeroski and Lavrac, 2001; Getoor and Taskar, 2007). The aim of this study is to apply and develop relational learners to predict customer churn using social network information derived from call detail record (CDR) data, which contain a vast number of communication logs between customers of a telecom operator.

2. Network learning systems

Macskassy and Provost (2007) introduced a node-centric, modular framework, in which a network learning system consists of:
1. a non-relational or local model,
2. a relational or network model,
3. a collective inference procedure.

This framework is adopted in this study in order to compare stand-alone versions of network learners with combinations of a local classifier and a network model, both with and without collective inference procedures. To this end, two large real-life data sets have been obtained from international telco operators, with the following characteristics:

Wouter Verbeke (1), Karel Dejaeger (1), Thomas Verbraken (1), David Martens (2), Bart Baesens (1, 3)
(1) Faculty of Business and Economics, Department of Decision Sciences and Information Management, Katholieke Universiteit Leuven, Naamsestraat 69, B-3000 Leuven, Belgium
(2) Faculty of Applied Economics, University of Antwerp, Prinsstraat 13, 2000 Antwerp, Belgium
(3) School of Management, University of Southampton, Highfield Southampton, SO17 1BJ, United Kingdom
wouter.verbeke@econ.kuleuven.be T: +32 16 32 68 87 F: +32 16 32 66 24 www.dataminingapps.com
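The three components of a network learning system listed in Section 2 can be illustrated with a small Python sketch. It is not the implementation evaluated in this study: the local model is a constant stand-in, the relational model is a weighted-vote relational neighbor classifier, and the collective inference step is a simple simultaneous-update loop in the spirit of relaxation labeling. The graph, labels, prior, and function names are all hypothetical.

```python
# Hypothetical undirected call graph: node -> {neighbor: edge weight}.
graph = {
    "a": {"b": 3.0, "c": 1.0},
    "b": {"a": 3.0},
    "c": {"a": 1.0, "d": 2.0},
    "d": {"c": 2.0},
}

# Known labels (1.0 = churner, 0.0 = non-churner); "c" and "d" are unknown.
known = {"a": 1.0, "b": 0.0}


def local_model(node):
    """Stand-in for a non-relational classifier: returns a constant prior."""
    return 0.1  # assumed base churn rate, purely illustrative


def relational_model(node, scores):
    """Weighted-vote relational neighbor: weighted mean of neighbor scores."""
    neighbors = graph[node]
    total = sum(neighbors.values())
    return sum(w * scores[n] for n, w in neighbors.items()) / total


def collective_inference(iterations=10):
    """Re-estimate all unknown nodes simultaneously from the previous scores."""
    # Initialise unknown nodes with the local model, known nodes with labels.
    scores = {n: known.get(n, local_model(n)) for n in graph}
    for _ in range(iterations):
        updated = dict(scores)
        for node in graph:
            if node not in known:
                updated[node] = relational_model(node, scores)
        scores = updated
    return scores


scores = collective_inference()
print(scores)
```

Calling `relational_model` once on the initial scores corresponds to a network model without collective inference, whereas `collective_inference` lets the estimates of unlabeled neighbors influence each other over several rounds. The modular split makes it possible to swap any of the three components independently, which is what allows the comparisons described above.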

[1] Foster J. Provost et al. Classification in Networked Data: a Toolkit and a Univariate Case Study, 2007, J. Mach. Learn. Res.

[2] Sougata Mukherjea et al. Social ties and their relevance to churn in mobile telecom networks, 2008, EDBT '08.

[3] Piotr Indyk et al. Enhanced hypertext categorization using hyperlinks, 1998, SIGMOD '98.

[4] Lise Getoor et al. Link-Based Classification, 2003, Encyclopedia of Machine Learning and Data Mining.

[5] Donald Geman et al. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images, 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6] Jennifer Neville et al. Relational Dependency Networks, 2007, J. Mach. Learn. Res.

[7] Jennifer Neville et al. Linkage and Autocorrelation Cause Feature Selection Bias in Relational Learning, 2002, ICML.