Default avoidance on credit card portfolios using accounting, demographical and exploratory factors: decision making based on machine learning (ML) techniques

Effective and thorough credit-risk management is essential for lending institutions, because borrower defaults can cause significant financial losses. Machine learning methods, which can measure and analyze credit risk objectively, are therefore attracting increasing attention. This study analyzes default payment data from a credit card portfolio of some 30,000 clients in Taiwan, described by twenty-three attributes with no missing values. We compare the prediction accuracy of seven classification methods: KNN, Logistic Regression, Naïve Bayes, Decision Trees, Random Forest, SVC, and Linear SVC. The results indicate that only a few of the variables typically used can adequately explain default characteristics for lending decisions. The results provide useful feedback to credit evaluators, lending institutions, and business analysts for in-depth analysis. They also point to the importance of precautionary borrowing techniques for better understanding credit-card borrowers' behavior, along with specific accounting, historical, and demographic characteristics.
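As a rough illustration of the kind of comparison described above, the following sketch fits the seven classifier families with scikit-learn and reports hold-out accuracy. It is not the study's pipeline: the data are a synthetic stand-in generated by `make_classification` (only the count of 23 attributes is taken from the abstract), and the sample size, class balance, train/test split, scaling step, and all hyperparameters are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in data, not the Taiwanese portfolio.
# 23 features mirror the attribute count in the abstract; the sample
# size and the 80/20 class balance are arbitrary illustrative choices.
X, y = make_classification(
    n_samples=2000,
    n_features=23,
    n_informative=6,
    weights=[0.8, 0.2],
    random_state=0,
)

# ASSUMPTION: a single stratified 70/30 hold-out split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def scaled(model):
    """Standardize features before distance- or margin-based models."""
    return make_pipeline(StandardScaler(), model)


# The seven method families named in the abstract, with
# illustrative (assumed) hyperparameters.
models = {
    "KNN": scaled(KNeighborsClassifier(n_neighbors=5)),
    "Logistic Regression": scaled(LogisticRegression(max_iter=1000)),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVC": scaled(SVC()),
    "Linear SVC": scaled(LinearSVC(max_iter=10000)),
}

# Fit each model and record its accuracy on the held-out set.
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)

# Print the methods from most to least accurate.
for name, acc in sorted(accuracy.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20s} {acc:.3f}")
```

With imbalanced default data, plain accuracy can look high even for a model that rarely flags defaulters, so a comparison of this kind would usually be read alongside class-sensitive measures such as recall on the default class.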
