Detecting the Risk of Customer Churn in Telecom Sector: A Comparative Study

Churn rate is the rate at which customers abandon a product or service. Identifying customers at risk of churn is essential for telecom companies that want to retain existing customers and maintain a competitive advantage. The purpose of this paper is to explore an effective method for detecting the risk of customer churn in the telecom sector by comparing advanced machine learning methods and their hyperparameter optimization algorithms. Using two different telecom datasets, a Mutual Information classifier was first used to select the features most relevant to customer churn. Next, a controlled-ratio undersampling strategy was employed to balance the minority and majority classes. Three hyperparameter optimization algorithms (Grid Search, Random Search, and Genetic Algorithms) were then combined with three promising machine learning models (Random Forest, Support Vector Machines, and K-nearest neighbors) to fit them to the customer churn prediction problem. Finally, six evaluation metrics (Accuracy, Recall, Precision, AUC, F1-score, and Mean Absolute Error) were used to evaluate the performance of the proposed models. The experimental results show that the Random Forest (RF) algorithm optimized by Grid Search with a low-ratio undersampling strategy (RF-GS-LR) outperformed the other models in extracting hidden information and anticipating the future churning behavior of customers on both datasets, with maximum accuracies of 99% and 95% on datasets 1 and 2, respectively.
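Two of the steps named above, controlled-ratio undersampling and metric-based evaluation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `undersample` and `binary_metrics`, the `ratio` parameter (majority rows kept per minority row), and the toy data are assumptions made purely for illustration. AUC is left out because it requires predicted scores rather than hard labels.

```python
import random
from collections import Counter


def undersample(rows, labels, ratio, minority=1, seed=0):
    """Randomly drop majority rows until majority:minority is about ratio:1.

    `ratio` is a hypothetical knob; the ratios used in the paper are not
    stated in the abstract.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    # Never try to keep more majority rows than actually exist.
    keep = min(len(majority_idx), int(round(ratio * len(minority_idx))))
    kept = sorted(minority_idx + rng.sample(majority_idx, keep))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, F1 and MAE for hard 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # For 0/1 labels, MAE is the misclassification rate (1 - accuracy).
        "mae": sum(abs(t - p) for t, p in pairs) / len(pairs),
    }


# Toy data: 10 churners (label 1) among 100 customers.
rows = list(range(100))
labels = [1] * 10 + [0] * 90
x_bal, y_bal = undersample(rows, labels, ratio=2)
class_counts = Counter(y_bal)

# Toy predictions to exercise the metrics.
scores = binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
```

With `ratio=2`, the balanced toy set keeps all 10 churners and 20 randomly chosen non-churners. In a real pipeline the undersampling would be applied to the training split only, so that the evaluation metrics reflect the original class distribution.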
