A New Customer Churn Prediction Approach Based on Soft Set Ensemble Pruning

Accurate customer churn prediction is vital to any business organization because acquiring new customers costs considerably more than retaining existing ones. Telecommunication companies have applied various single classifiers to churn classification, but the resulting accuracy remains relatively low. Accuracy can be improved by combining the decisions of multiple classifiers through an ensemble method. Although ensemble methods can achieve the highest classification accuracy, they suffer significantly from the large number of base classifiers they maintain. In previous work, we therefore proposed a novel soft-set-based method that prunes the classifiers in a heterogeneous ensemble committee, selecting the best subset of component classifiers prior to the combination process. That study demonstrated that our soft set ensemble pruning removes a substantial number of classifiers while still producing the highest prediction accuracy. In this paper, we extend our soft set ensemble pruning to a customer churn dataset. The results show that the proposed method overcomes one of the main drawbacks of ensemble methods: pruning based on soft set theory not only reduces the number of ensemble members but also increases the accuracy of customer churn prediction.
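To make the general idea of ensemble pruning concrete, the sketch below selects a subset of base classifiers on held-out validation predictions before majority-vote combination. It is a minimal illustration using a greedy forward-selection criterion, not the soft-set-based criterion proposed in the paper; all function names and the tiny synthetic data are hypothetical.

```python
# Illustrative ensemble pruning via greedy forward selection.
# NOTE: this is a generic stand-in criterion; the paper's soft-set-based
# pruning is not implemented here.

def majority_vote(prediction_sets):
    """Combine per-classifier prediction lists by simple majority vote."""
    return [max(set(votes), key=votes.count) for votes in zip(*prediction_sets)]

def accuracy(preds, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def greedy_prune(all_preds, labels):
    """Repeatedly add the classifier whose inclusion most improves
    validation accuracy of the voted committee; stop when no addition helps."""
    remaining = list(range(len(all_preds)))
    selected, best_acc = [], 0.0
    while remaining:
        # Evaluate each candidate classifier added to the current committee.
        acc, i = max(
            (accuracy(majority_vote([all_preds[j] for j in selected + [i]]),
                      labels), i)
            for i in remaining
        )
        if selected and acc <= best_acc:
            break  # no remaining classifier improves the committee
        selected.append(i)
        remaining.remove(i)
        best_acc = acc
    return selected, best_acc

# Toy validation set: four base classifiers, one of them anti-correlated.
labels = [1, 0, 1, 1, 0, 1]
preds = [
    [1, 0, 1, 1, 0, 1],  # strong classifier
    [1, 0, 1, 0, 0, 1],  # one mistake
    [0, 1, 0, 0, 1, 0],  # inverted labels
    [1, 1, 1, 1, 1, 1],  # always predicts churn
]
subset, acc = greedy_prune(preds, labels)
```

Here the pruned committee keeps only the classifiers that jointly help on validation data, so its voted accuracy is at least that of the full four-member ensemble, mirroring the paper's observation that pruning can shrink the ensemble while raising accuracy.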
