Assessing naïve Bayes as a method for screening credit applicants

The naïve Bayes rule (NBR) is a popular and often highly effective technique for constructing classification rules. This study examines how well NBR performs when used to build credit scorecards, that is, classification rules for screening credit applicants (credit scoring). Using two real-world credit scoring data sets, it benchmarks NBR against linear discriminant analysis, logistic regression analysis, k-nearest neighbours, classification trees and neural networks. The first data set comes from a major Greek bank; the second is the Australian Credit Approval data set from the UCI Machine Learning Repository (available at http://www.ics.uci.edu/~mlearn/MLRepository.html). The predictive ability of the scorecards is measured by the total percentage of correctly classified cases, the Gini coefficient and the bad rate amongst accepts. In each data set, NBR is found to have lower predictive ability than some of the other five methods under every measure used. The study then examines reasons why the predictive ability of NBR may suffer relative to that of the alternative methods in the context of credit scoring.
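The three performance measures named above can be illustrated with a minimal Python sketch. It is not taken from the study. It assumes that labels are coded 1 for a good applicant and 0 for a bad one, that a higher score means a more likely good, that applicants scoring at or above a cutoff are accepted, and that the Gini coefficient is computed as 2 × AUC − 1, as is common in credit scoring. The function names, the toy data and the cutoff of 0.5 are all hypothetical.

```python
def pct_correct(y_true, y_pred):
    """Total percentage of correctly classified cases."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * hits / len(y_true)


def gini(y_true, scores):
    """Gini coefficient, assumed here to be 2 * AUC - 1.

    AUC is estimated as the share of (good, bad) pairs in which the
    good applicant has the higher score; ties count as half a pair.
    """
    goods = [s for t, s in zip(y_true, scores) if t == 1]
    bads = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for g in goods:
        for b in bads:
            if g > b:
                wins += 1.0
            elif g == b:
                wins += 0.5
    auc = wins / (len(goods) * len(bads))
    return 2.0 * auc - 1.0


def bad_rate_amongst_accepts(y_true, scores, cutoff):
    """Proportion of bad applicants among those accepted (score >= cutoff)."""
    accepted = [t for t, s in zip(y_true, scores) if s >= cutoff]
    if not accepted:
        return 0.0  # no applicant accepted at this cutoff
    return sum(1 for t in accepted if t == 0) / len(accepted)


# Hypothetical toy data: 1 = good, 0 = bad.
y_true = [1, 1, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
cutoff = 0.5  # assumed acceptance threshold
y_pred = [1 if s >= cutoff else 0 for s in scores]

accuracy = pct_correct(y_true, y_pred)
gini_value = gini(y_true, scores)
bad_rate = bad_rate_amongst_accepts(y_true, scores, cutoff)
```

The percentage correctly classified and the bad rate amongst accepts both depend on the chosen cutoff, whereas the Gini coefficient summarises how well the scores rank goods above bads across all cutoffs. A study may fix the acceptance rate rather than the score cutoff when computing the bad rate; the sketch uses a cutoff only for simplicity.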
