Credit card churn forecasting by logistic regression and decision tree

In this paper, two data mining algorithms are applied to build a churn prediction model using credit card data collected from a real Chinese bank. The contributions of four variable categories (customer information, card information, risk information, and transaction activity information) are examined. The paper analyzes how variables should be handled when the data is obtained from a database rather than a survey. Rather than entering all 135 variables into the model directly, it selects variables on the basis of both correlation and economic sense. In addition to measuring the accuracy of the results, the paper designs a misclassification cost measure that takes the two types of error and their economic implications into account, which makes it better suited to evaluating a credit card churn prediction model. The algorithms used in this study are logistic regression and decision trees, both mature and powerful classification algorithms. The test results show that logistic regression performs slightly better than the decision tree.
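
The idea of a misclassification cost measure can be illustrated with a minimal sketch. The abstract does not state the paper's actual cost formula or cost values, so the function name `misclassification_cost`, the weights `cost_fn` and `cost_fp`, and the toy labels below are all assumptions chosen for illustration. The sketch assumes that missing a real churner (a false negative) costs more than wrongly flagging a loyal customer (a false positive).

```python
def misclassification_cost(y_true, y_pred, cost_fn=5.0, cost_fp=1.0):
    """Average misclassification cost per customer.

    Labels: 1 = churner, 0 = non-churner.
    cost_fn and cost_fp are hypothetical weights, not values from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one observation is required")
    # False negative: a churner predicted as staying.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # False positive: a loyal customer predicted as churning.
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return (cost_fn * fn + cost_fp * fp) / len(y_true)


def accuracy(y_true, y_pred):
    """Share of customers classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy data: model A misses one churner, model B raises two false alarms.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
pred_a = [1, 0, 0, 0, 0, 1, 0, 0]
pred_b = [1, 1, 1, 1, 0, 1, 0, 0]

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, accuracy(y_true, pred), misclassification_cost(y_true, pred))
```

With these assumed weights, model A has the higher accuracy (0.875 against 0.75) but also the higher average cost (0.625 against 0.25), which is the kind of disagreement between accuracy and cost that motivates evaluating churn models on both.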
