Multiple classifier application to credit risk assessment

Credit risk prediction models seek to predict quality factors such as whether an individual will default on a loan (a bad applicant) or not (a good applicant). This can be treated as a machine learning (ML) classification problem. ML algorithms have recently proven to be of great practical value in solving a variety of risk problems, including credit risk prediction. One of the most active areas of recent ML research has been the use of ensembles of (combined) classifiers. Research indicates that combining individual classifiers into an ensemble, in which the classifiers vote for the most popular class, can significantly improve classification performance. This paper explores the predictive behaviour of five classifiers under different types of noise in terms of credit risk prediction accuracy, and how that accuracy could be improved by using classifier ensembles. Benchmarking results on four credit datasets are presented, together with a comparison against the predictive accuracy of each individual classifier at various attribute noise levels. The experimental evaluation shows that classifier ensembles have the potential to improve prediction accuracy.
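The voting idea mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: the three rule-based "classifiers", their thresholds, and the applicant fields (`income`, `debt_ratio`, `prior_defaults`) are hypothetical stand-ins chosen only to show how an unweighted majority vote combines individual predictions.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most base classifiers.

    Ties go to the label that appears first in `predictions`.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(classifiers, applicant):
    """Ask every base classifier for a label and combine them by vote."""
    return majority_vote([clf(applicant) for clf in classifiers])


# Hypothetical base classifiers: simple threshold rules, assumed for
# illustration only (a real study would use trained models).
def income_rule(a):
    return "good" if a["income"] >= 30000 else "bad"


def debt_rule(a):
    return "good" if a["debt_ratio"] < 0.4 else "bad"


def history_rule(a):
    return "bad" if a["prior_defaults"] > 0 else "good"


applicant = {"income": 42000, "debt_ratio": 0.55, "prior_defaults": 0}
label = ensemble_predict([income_rule, debt_rule, history_rule], applicant)
print(label)  # two of the three rules vote "good"
```

Here one rule misclassifies the applicant as bad, but the other two outvote it, which is the intuition behind an ensemble tolerating errors made by individual members.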
