Improving churn prediction in telecommunications using complementary fusion of multilayer features based on factorization and construction

High-dimensional and imbalanced datasets are the main obstacles to achieving good churn prediction performance, so feature selection is needed to improve the model. This paper proposes a new prediction framework based on complementary fusion of multilayer features. Several feature subsets are obtained by feature factorization, and new features are obtained by feature construction. Effective features are then selected by multilayer complementary fusion according to the contribution of each feature subset and each new feature. In this way, the defects caused by imbalanced class distributions can be remedied, prediction accuracy can be improved, and system stability can be reinforced. Five data mining models were applied to customer churn prediction. Experimental results demonstrate that the proposed method overcomes the inflexibility of traditional feature selection algorithms and is more effective than existing methods in the telecommunications industry. Furthermore, by exploring the advantages and limitations of each feature subset and prediction technique, we identified the optimal combination of fusion and prediction model for customer churn prediction in the telecommunications industry.
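The abstract does not give the fusion procedure in detail, so the following is only a minimal sketch of the general idea: score candidate feature subsets (raw and constructed) by their contribution, then fuse the ones that help. Everything here is an assumption made for illustration: the function names, the toy data, the constructed feature `min_per_call`, the nearest-centroid classifier standing in for the five unnamed models, balanced accuracy as the contribution measure, and greedy forward addition as the fusion rule.

```python
from statistics import mean

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so the minority (churn) class counts equally."""
    recalls = []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return mean(recalls)

def centroid_score(rows, labels, features):
    """Score a feature set with a nearest-centroid classifier (a stand-in model)."""
    if not features:
        return 0.5  # chance level for balanced accuracy
    cents = {
        cls: [mean(r[f] for r, y in zip(rows, labels) if y == cls) for f in features]
        for cls in (0, 1)
    }
    preds = []
    for r in rows:
        dist = {
            cls: sum((r[f] - c) ** 2 for f, c in zip(features, cents[cls]))
            for cls in (0, 1)
        }
        preds.append(min(dist, key=dist.get))
    return balanced_accuracy(labels, preds)

def fuse_subsets(rows, labels, subsets):
    """Greedily add the subset with the largest score; stop when nothing improves."""
    chosen, features = [], []
    best = centroid_score(rows, labels, features)
    remaining = dict(subsets)
    while remaining:
        scores = {
            name: centroid_score(rows, labels, features + cols)
            for name, cols in remaining.items()
        }
        name = max(scores, key=scores.get)
        if scores[name] <= best:
            break
        best = scores[name]
        chosen.append(name)
        features = features + remaining.pop(name)
    return chosen, best

# Hypothetical toy data: (minutes, calls, churn label); churners are the minority.
raw = [(300, 100, 0), (150, 50, 0), (320, 100, 0), (160, 50, 0),
       (310, 100, 0), (145, 50, 0), (150, 100, 1), (140, 100, 1)]
rows = [{"minutes": m, "calls": c, "min_per_call": m / c} for m, c, _ in raw]
labels = [y for _, _, y in raw]

# One raw subset and one constructed subset (an assumed split, for illustration).
subsets = {"raw": ["minutes", "calls"], "constructed": ["min_per_call"]}

chosen, best = fuse_subsets(rows, labels, subsets)
print(chosen, best)
```

On this toy data the constructed ratio separates the two classes while the raw subset does not, so the greedy step keeps only the constructed subset. Scoring on the training rows keeps the sketch short; a real evaluation would use held-out data or cross-validation.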
