The individual borrowers recognition: Single and ensemble trees

Banks provide a financial intermediary service by channeling funds efficiently between borrowers and lenders. Bank lending is subject to credit risk when loans are not repaid on time or go into default. A methodology for evaluating the creditworthiness of a borrower is therefore crucial to the bank's risk management and profitability. The aim of the paper is the dichotomous classification of individual borrowers into creditworthy and non-creditworthy clients. Borrowers are classified using single and aggregated classification trees. Classification trees are a powerful alternative to more traditional statistical models. They can detect non-linear relationships and perform well in the presence of qualitative information, as is the case in the creditworthiness evaluation of individual borrowers. As a result, they are widely used as base classifiers for ensemble methods. The aggregated classification trees are constructed with two ensemble methods: AdaBoost and bagging. AdaBoost constructs its base classifiers in sequence, updating a distribution over the training examples before each base classifier is built. Bagging combines individual classifiers built on bootstrap replicates of the training set. The research uses actual data on individual borrowers who obtained a mortgage loan from one of the commercial banks operating in Poland. Each client is described by 11 variables. The grouping variable indicates whether the client repays the loan regularly, in accordance with the credit agreement, or is in arrears. The diagnostic variables describe the clients in terms of demographic features and characterize the loans to be repaid (e.g. value and currency of the credit, credit rate).
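The two ensemble schemes described above can be illustrated with a minimal sketch. It is not the procedure used in the paper: the base classifier is a one-split tree (a decision stump) rather than a full classification tree, the data are synthetic, and every name in the code (`make_toy_data`, `fit_stump`, `adaboost`, `bagging`, and the two toy features) is a hypothetical choice made for illustration. Labels are coded +1 (repays regularly) and -1 (in arrears).

```python
import math
import random

def make_toy_data(n=150, seed=1):
    """Synthetic stand-in for borrower data (assumption, not the bank's data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x0, x1 = rng.uniform(0, 1), rng.uniform(0, 1)
        X.append([x0, x1])
        y.append(1 if x0 > x1 else -1)  # diagonal class boundary
    return X, y

def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] <= t else -sign

def fit_stump(X, y, w):
    """Exhaustively pick the one-split tree with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (sign if row[j] <= t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best  # (weighted error, feature index, threshold, sign)

def adaboost(X, y, rounds=20):
    """Build stumps in sequence, updating the distribution w over examples."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = max(stump[0], 1e-10)  # guard against log of infinity
        if err >= 0.5:              # base classifier no better than chance
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified examples gain weight, correct ones lose weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

def bagging(X, y, n_models=11, seed=0):
    """Fit one stump per bootstrap replicate (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        models.append(fit_stump(Xb, yb, [1.0 / n] * n))
    return models

def bagging_predict(models, row):
    votes = sum(stump_predict(m, row) for m in models)  # unweighted majority vote
    return 1 if votes >= 0 else -1

def accuracy(predict, X, y):
    return sum(predict(row) == yi for row, yi in zip(X, y)) / len(y)

if __name__ == "__main__":
    X, y = make_toy_data()
    X_train, y_train, X_test, y_test = X[:100], y[:100], X[100:], y[100:]
    single = fit_stump(X_train, y_train, [1.0 / 100] * 100)
    boosted = adaboost(X_train, y_train)
    bagged = bagging(X_train, y_train)
    print("single stump:", accuracy(lambda r: stump_predict(single, r), X_test, y_test))
    print("AdaBoost:    ", accuracy(lambda r: adaboost_predict(boosted, r), X_test, y_test))
    print("bagging:     ", accuracy(lambda r: bagging_predict(bagged, r), X_test, y_test))
```

The sketch makes the structural difference visible: AdaBoost depends on the order of its base classifiers, because each reweighting uses the previous stump's errors, whereas the bagged stumps are built independently and could be fitted in any order.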
