Profit driven data mining in massive customer networks: new insights and algorithms

Customer churn prediction models aim to detect customers with a high propensity to attrite. Predictive power, comprehensibility, and justifiability are all key aspects of these models. An accurate model makes it possible to correctly target future churners in customer retention campaigns, while a comprehensible and intuitive rule set makes it possible to identify the main drivers of customer churn and to develop an effective retention strategy in accordance with domain knowledge. This chapter provides an extended overview of the literature on the use of data mining for customer churn pre-

[1]  Euiho Suh,et al.  An LTV model and customer segmentation based on customer value: a case study on the wireless telecommunication industry , 2004, Expert Syst. Appl..

[2]  Stanley Wasserman,et al.  Social Network Analysis: Methods and Applications , 1994, Structural analysis in the social sciences.

[3]  Vadlamani Ravi,et al.  Predicting credit card customer churn in banks using data mining , 2008, Int. J. Data Anal. Tech. Strateg..

[4]  A Novel Credit Rating Migration Modeling Approach Using Macroeconomic Indicators , 2013 .

[5]  Sumit Sarkar,et al.  The Role of the Management Sciences in Research on Personalization , 2003, Manag. Sci..

[6]  Bart Baesens,et al.  Performance of classification models from a user perspective , 2011, Decis. Support Syst..

[7]  Said Salhi,et al.  An ant system algorithm for the mixed vehicle routing problem with backhauls , 2004 .

[8]  Zhou Shui Classification in Networked Data:A Survey , 2011 .

[9]  Bart Baesens,et al.  Mining software repositories for comprehensible software fault prediction models , 2008, J. Syst. Softw..

[10]  Bart Baesens,et al.  Software Effort Prediction Using Regression Rule Extraction from Neural Networks , 2010, 2010 22nd IEEE International Conference on Tools with Artificial Intelligence.

[11]  Lise Getoor,et al.  Link-Based Classification , 2003, Encyclopedia of Machine Learning and Data Mining.

[12]  Selwyn Piramuthu,et al.  Artificial Intelligence and Information Technology Evaluating feature selection methods for learning in data mining applications , 2004 .

[13]  Abraham Silberschatz,et al.  On Subjective Measures of Interestingness in Knowledge Discovery , 1995, KDD.

[14]  Foster J. Provost,et al.  Distribution-based aggregation for relational learning with identifier attributes , 2006, Machine Learning.

[15]  Hong Tang,et al.  Data mining techniques for cancer detection using serum proteomic profiling , 2004, Artif. Intell. Medicine.

[16]  Bart Baesens,et al.  Including Domain Knowledge in Customer Churn Prediction Using AntMiner+ , 2009, Industrial Conference on Data Mining - Workshop DMM.

[17]  Peter A. Flach,et al.  Propositionalization approaches to relational data mining , 2001 .

[18]  David J. Hand,et al.  Measuring classifier performance: a coherent alternative to the area under the ROC curve , 2009, Machine Learning.

[19]  Alex A. Freitas,et al.  An ant colony based system for data mining: applications to medical data , 2001 .

[20]  Gregory Piatetsky-Shapiro,et al.  Estimating campaign benefits and modeling lift , 1999, KDD '99.

[21]  Bart Baesens,et al.  Decompositional Rule Extraction from Support Vector Machines by Active Learning , 2009, IEEE Transactions on Knowledge and Data Engineering.

[22]  Bart Baesens,et al.  Comprehensible Credit Scoring Models Using Rule Extraction from Support Vector Machines , 2007, Eur. J. Oper. Res..

[23]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[24]  Bernard De Baets,et al.  Supervised ranking in the weka environment , 2010, Inf. Sci..

[25]  Monique Snoeck,et al.  Classification With Ant Colony Optimization , 2007, IEEE Transactions on Evolutionary Computation.

[26]  Chih-Ping Wei,et al.  Turning telecommunications call details to churn prediction: a data mining approach , 2002, Expert Syst. Appl..

[27]  Dirk Van den Poel,et al.  Predicting customer retention and profitability by using random forests and regression forests techniques , 2005, Expert Syst. Appl..

[28]  David J. Hand,et al.  ROC Curves for Continuous Data , 2009 .

[29]  Bart Baesens,et al.  Forecasting and analyzing insurance companies' ratings , 2007 .

[30]  Robert Tibshirani,et al.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition , 2001, Springer Series in Statistics.

[31]  Johan A. K. Suykens,et al.  Least Squares Support Vector Machine Classifiers , 1999, Neural Processing Letters.

[32]  Wagner A. Kamakura,et al.  Defection Detection: Measuring and Understanding the Predictive Accuracy of Customer Churn Models , 2006 .

[33]  Leon Sterling,et al.  Learning and classification of ordinal concepts , 1988 .

[34]  Bart Baesens,et al.  Building comprehensible customer churn prediction models with advanced rule induction techniques , 2011, Expert Syst. Appl..

[35]  M. McPherson,et al.  Birds of a Feather: Homophily in Social Networks , 2001 .

[36]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[37]  Ian H. Witten,et al.  Data mining: practical machine learning tools and techniques with Java implementations , 2002, SGMD.

[38]  Bart Baesens,et al.  New insights into churn prediction in the telecommunication sector: A profit driven data mining approach , 2012, Eur. J. Oper. Res..

[39]  Bart Baesens,et al.  Predicting going concern opinion with data mining , 2008, Decis. Support Syst..

[40]  Bart Baesens,et al.  Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings , 2008, IEEE Transactions on Software Engineering.

[41]  Christophe Croux,et al.  Bagging and Boosting Classification Trees to Predict Churn , 2006 .

[42]  R. Rust,et al.  Customer satisfaction, customer retention, and market share , 1993 .

[43]  Bart Baesens,et al.  Rule based predictive models, decision table and tree: an empirical evaluation on comprehensibility , 2010 .

[44]  Bart Baesens,et al.  Profit optimizing customer churn prediction with Bayesian network classifiers , 2014, Intell. Data Anal..

[45]  J. J. Rocchio,et al.  Relevance feedback in information retrieval , 1971 .

[46]  Mark Newman,et al.  Networks: An Introduction , 2010 .

[47]  Francesco Viti,et al.  Sensor Locations for Reliable Travel Time Prediction and Dynamic Management of Traffic Networks , 2008 .

[48]  Eric D. Kolaczyk,et al.  Statistical Analysis of Network Data: Methods and Models , 2009 .

[49]  Bart Baesens,et al.  Credit scoring for microfinance: is it worth it? , 2012 .

[50]  Bart Baesens,et al.  An analysis of the applicability of credit scoring for microfinance , 2009 .

[51]  Marco Dorigo,et al.  Ant-Based Clustering and Topographic Mapping , 2006, Artificial Life.

[52]  David C. Yen,et al.  Applying data mining to telecom churn management , 2006, Expert Syst. Appl..

[53]  Evangelos Xevelonakis Developing retention strategies based on customer profitability in telecommunications: An empirical study , 2005 .

[54]  Michel Happiette,et al.  A neural clustering and classification system for sales forecasting of new apparel items , 2007, Appl. Soft Comput..

[55]  Bart Baesens,et al.  Mining social networks for customer churn prediction , 2011 .

[56]  Bart Baesens,et al.  Domain knowledge integration in data mining using decision tables: case studies in churn prediction , 2009, J. Oper. Res. Soc..

[57]  F. Wilcoxon Individual Comparisons by Ranking Methods , 1945 .

[58]  Bart Baesens,et al.  Comparing classification techniques to forecast customer churn , 2010 .

[59]  Balaji Padmanabhan,et al.  On the Use of Optimization for Data Mining: Theoretical Interactions and eCRM Opportunities , 2003, Manag. Sci..

[60]  Thomas Stützle,et al.  MAX-MIN Ant System , 2000, Future Gener. Comput. Syst..

[61]  Bart Baesens,et al.  A Novel Profit Maximizing Metric for Measuring Classification Performance of Customer Churn Prediction Models , 2013, IEEE Transactions on Knowledge and Data Engineering.

[62]  Jan Vanthienen,et al.  A tool-supported approach to inter-tabular verification , 1998 .

[63]  Gary Madden,et al.  Subscriber churn in the Australian ISP market , 1999 .

[64]  Oscar Castillo,et al.  Path planning for autonomous mobile robot navigation with ant colony optimization and fuzzy cost function evaluation , 2009, Appl. Soft Comput..

[65]  Bernhard Lang,et al.  Monotonic Multi-layer Perceptron Networks as Universal Approximators , 2005, ICANN.

[66]  Gregory Piatetsky-Shapiro,et al.  A Comparison of Approaches for Maximizing Business Payoff of Prediction Models , 1996, KDD.

[67]  Thomas Verbraken,et al.  The complementarity of networked and non-networked classifiers , 2011 .

[68]  F. Reichheld LEARNING FROM CUSTOMER DEFECTIONS , 1996 .

[69]  Donald K. Wedding,et al.  Discovering Knowledge in Data, an Introduction to Data Mining , 2005, Inf. Process. Manag..

[70]  Christos Faloutsos,et al.  Fast and Effective Retrieval of Medical Tumor Shapes , 1998, IEEE Trans. Knowl. Data Eng..

[71]  Sougata Mukherjea,et al.  Analyzing the Structure and Evolution of Massive Telecom Graphs , 2008, IEEE Transactions on Knowledge and Data Engineering.

[72]  A. Parasuraman,et al.  The Behavioral Consequences of Service Quality , 1996 .

[73]  Hussein A. Abbass,et al.  Classification rule discovery with ant colony optimization , 2003, IEEE/WIC International Conference on Intelligent Agent Technology, 2003. IAT 2003..

[74]  R. E. Lee,et al.  Distribution-free multiple comparisons between successive treatments , 1995 .

[75]  Michèle Paulin,et al.  Relational norms and client retention: external effectiveness of commercial banking in Canada and Mexico , 1998 .

[76]  Foster J. Provost,et al.  Aggregation-based feature invention and relational concept classes , 2003, KDD '03.

[77]  Jong Woo Kim,et al.  A hybrid classification method using error pattern modeling , 2008, Expert Syst. Appl..

[78]  MartensDavid Building acceptable classification models for financial engineering applications , 2008 .

[79]  A. J. Feelders,et al.  Classification trees for problems with monotonicity constraints , 2002, SKDD.

[80]  Chris Tampère,et al.  Modeling Traffic Operations on Intersections Using Monte-Carlo Simulation Techniques , 2008 .

[81]  Bart Baesens,et al.  Using Social Network Classifiers for Predicting E-Commerce Adoption , 2011, WEB.

[82]  Jennifer Neville,et al.  Relational Dependency Networks , 2007, J. Mach. Learn. Res..

[83]  Eric Johnson,et al.  Predicting subscriber dissatisfaction and improving retention in the wireless telecommunications industry , 2000, IEEE Trans. Neural Networks Learn. Syst..

[84]  Daniel T. Larose,et al.  Discovering Knowledge in Data: An Introduction to Data Mining , 2005 .

[85]  R. Mizerski An Attribution Explanation of the Disproportionate Influence of Unfavorable Information , 1982 .

[86]  John A. Swets,et al.  Evaluation of diagnostic systems : methods from signal detection theory , 1982 .

[87]  Bart Baesens,et al.  Ant-Based Approach to the Knowledge Fusion Problem , 2006, ANTS Workshop.

[88]  Bart Baesens,et al.  Customer churn prediction: does technique matter? , 2010 .

[89]  Jennifer Neville,et al.  Linkage and Autocorrelation Cause Feature Selection Bias in Relational Learning , 2002, ICML.

[90]  Yossi Richter,et al.  Predicting Customer Churn in Mobile Networks through Analysis of Social Groups , 2010, SDM.

[91]  Bart Baesens,et al.  Social network analysis for customer churn prediction , 2014, Appl. Soft Comput..

[92]  Eibe Frank,et al.  Logistic Model Trees , 2003, ECML.

[93]  Jan Vanthienen,et al.  An Illustration of Verification and Validation in the Modelling Phase of KBS Development , 1998, Data Knowl. Eng..

[94]  Ron Kohavi,et al.  The Case against Accuracy Estimation for Comparing Induction Algorithms , 1998, ICML.