Establishing decision tree-based short-term default credit risk assessment models

ABSTRACT Traditional credit risk assessment models ignore the time factor: they predict whether a customer will default, but not when. Such results do not help a manager make the profit-maximizing decision, because under some conditions a financial institution can still profit from a loan that eventually defaults. Much recent research incorporates the Cox proportional hazards model into credit scoring to predict when a customer is most likely to default. However, fully exploiting the dynamic capability of the Cox proportional hazards model requires time-varying macroeconomic variables, which demand more advanced data collection. Because short-term defaults cause large losses for a financial institution, a loan manager approving applications is less interested in predicting when a loan will default than in identifying applications that are likely to default within a short period. This paper proposes a decision tree-based short-term default credit risk assessment model. The goal is to use the decision tree to screen out short-term defaults, producing a highly accurate model for identifying loans that will default. The model integrates bootstrap aggregating (Bagging) with the synthetic minority over-sampling technique (SMOTE) to improve the stability of the decision tree and its performance on imbalanced data. Finally, a real case using small and medium enterprise loan data from a financial institution in Taiwan illustrates the proposed approach. Compared with the logistic regression and Cox proportional hazards models, the proposed model achieved clearly higher recall and precision.
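
The combination described in the abstract (a short-term default label, SMOTE oversampling, and bagged trees) can be sketched as follows. This is a minimal illustration and not the paper's implementation: the 12-month horizon, the toy loan data, the feature names, the choice to apply SMOTE inside each bootstrap sample, and the use of depth-1 trees (stumps) in place of a full decision tree learner are all assumptions made to keep the example short and dependency-free.

```python
import random

def short_term_label(months_to_default, horizon=12):
    """Return 1 if the loan defaulted within `horizon` months, else 0.
    None means no default was observed. The 12-month horizon is an assumption."""
    return int(months_to_default is not None and months_to_default <= horizon)

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def smote(minority, n_new, k=3, rng=random):
    """SMOTE: build each synthetic row by interpolating between a minority row
    and one of its k nearest minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbour = rng.choice(sorted(others, key=lambda p: sq_dist(x, p))[:k])
        gap = rng.random()
        synthetic.append([u + gap * (v - u) for u, v in zip(x, neighbour)])
    return synthetic

def fit_stump(X, y):
    """Depth-1 decision tree: choose (feature, threshold, class above threshold)
    with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for above in (0, 1):
                errors = sum((above if row[j] > t else 1 - above) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, above)
    return best[1:]

def predict_stump(stump, row):
    j, t, above = stump
    return above if row[j] > t else 1 - above

def fit_bagged(X, y, n_models=25, seed=0):
    """Bagging: fit one stump per bootstrap sample, rebalancing each sample
    with SMOTE when the default class (label 1) is the smaller one."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        minority = [row for row, label in zip(Xb, yb) if label == 1]
        shortfall = (len(yb) - len(minority)) - len(minority)
        if len(minority) >= 2 and shortfall > 0:
            new_rows = smote(minority, shortfall, rng=rng)
            Xb, yb = Xb + new_rows, yb + [1] * len(new_rows)
        models.append(fit_stump(Xb, yb))
    return models

def predict_bagged(models, row):
    """Majority vote over the bagged stumps."""
    votes = sum(predict_stump(m, row) for m in models)
    return int(2 * votes > len(models))

# Made-up loans: ([debt_ratio, years_in_business], months until default or None)
loans = [
    ([0.10, 5], None), ([0.20, 6], None), ([0.15, 7], 30), ([0.30, 8], None),
    ([0.25, 4], None), ([0.20, 9], 40), ([0.10, 3], None), ([0.35, 6], None),
    ([0.80, 1], 3), ([0.90, 2], 6), ([0.85, 1], 9),
]
X = [features for features, _ in loans]
y = [short_term_label(months) for _, months in loans]
models = fit_bagged(X, y)
print(predict_bagged(models, [0.88, 1]), predict_bagged(models, [0.05, 5]))
```

Note that loans defaulting after the horizon (30 and 40 months here) are labeled 0, which reflects the abstract's focus on short-term defaults rather than on default in general. In practice, library implementations such as a full decision tree learner with established SMOTE and bagging routines would replace these hand-written helpers.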
