A Deep Learning Approach Using DeepGBM for Credit Assessment

In the loan business, banks must assess the credit of their customers to reduce loan risk, so how to assess personal credit is a problem worth studying. Traditional credit assessment has often relied on logistic regression, decision trees, random forests, and similar methods. In recent years, a newer machine learning method, LightGBM [1], has also been applied to credit assessment with good results. None of these models, however, addresses the fact that credit assessment data sets combine sparse categorical features with dense numerical features. DeepGBM [2], proposed by Guolin Ke, Zhenhui Xu, Jia Zhang, et al., handles this combination well, so our research adopts this recent deep learning framework. DeepGBM consists of two parts, CatNN and GBDT2NN, which handle sparse categorical features and dense numerical features, respectively. This paper uses a data set from Kaggle, Home Credit Default Risk, on which we ran several different experiments. The results of these experiments show that DeepGBM performs better than the other models.
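As a rough illustration of the split between the two components, the sketch below partitions the columns of a small table into categorical and numerical groups, the kind of routing that would send one group to CatNN and the other to GBDT2NN. It is a minimal sketch and not the DeepGBM implementation: the function name `split_feature_columns`, the column names, and the sample values are all hypothetical, and the type-based rule is an assumption (real data sets may encode categories as integers).

```python
from typing import Any


def split_feature_columns(rows: list[dict[str, Any]]) -> tuple[list[str], list[str]]:
    """Partition column names into (categorical, numerical) by value type.

    Hypothetical helper: a column counts as numerical only if every
    non-missing value is an int or float; everything else is categorical.
    """
    categorical: list[str] = []
    numerical: list[str] = []
    for name in rows[0]:
        values = [row[name] for row in rows if row[name] is not None]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_numeric = bool(values) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        (numerical if is_numeric else categorical).append(name)
    return categorical, numerical


# Made-up applicant records; names and values are illustrative only.
rows = [
    {"contract_type": "Cash loans", "gender": "F", "income": 90000.0, "credit_amount": 250000.0},
    {"contract_type": "Revolving loans", "gender": "M", "income": 120000.0, "credit_amount": None},
]

cat_cols, num_cols = split_feature_columns(rows)
print("categorical (CatNN side):", cat_cols)
print("numerical (GBDT2NN side):", num_cols)
```

In practice the column grouping would more likely come from the data set's documentation than from value types, since integer-coded categories would be misclassified by this rule.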

[1] K. I. Ramachandran, et al. Feature selection using Decision Tree and classification through Proximal Support Vector Machine for fault diagnostics of roller bearing, 2007.

[2] Jonathan N. Crook, et al. Recent developments in consumer credit risk assessment, 2007, Eur. J. Oper. Res.

[3] Dean Fantazzini, et al. Random Survival Forests Models for SME Credit Risk Measurement, 2009.

[4] Charles X. Ling, et al. Using AUC and accuracy in evaluating learning algorithms, 2005, IEEE Transactions on Knowledge and Data Engineering.

[5] Andy Liaw, et al. Classification and Regression by randomForest, 2007.

[6] Ping Li, et al. Robust LogitBoost and Adaptive Base Class (ABC) LogitBoost, 2010, UAI.

[7] Donald F. Specht, et al. A general regression neural network, 1991, IEEE Trans. Neural Networks.

[8] Tie-Yan Liu, et al. DeepGBM: A Deep Learning Framework Distilled by GBDT for Online Prediction Tasks, 2019, KDD.

[9] D. Hand, et al. A k-nearest-neighbour classifier for assessing consumer credit risk, 1996.

[10] Yufei Xia, et al. A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring, 2017, Expert Syst. Appl.

[11] Geoffrey E. Hinton, et al. Deep Learning, 2015, Nature.

[12] Yong Shi, et al. Credit card churn forecasting by logistic regression and decision tree, 2011, Expert Syst. Appl.

[13] Alan K. Reichert, et al. An Examination of the Conceptual Issues Involved in Developing Credit-Scoring Models, 1983.

[14] J. Friedman. Greedy function approximation: A gradient boosting machine, 2001.

[15] Christophe Mues, et al. An experimental comparison of classification algorithms for imbalanced credit scoring data sets, 2012, Expert Syst. Appl.

[16] Edward I. Altman, et al. Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy, 1968.

[17] R. Real, et al. AUC: a misleading measure of the performance of predictive distribution models, 2008.

[18] Donald F. Specht, et al. Probabilistic neural networks, 1990, Neural Networks.

[19] Tie-Yan Liu, et al. LightGBM: A Highly Efficient Gradient Boosting Decision Tree, 2017, NIPS.

[20] Norbert Jankowski, et al. Feature selection with decision tree criterion, 2005, Fifth International Conference on Hybrid Intelligent Systems (HIS'05).

[21] Christopher J. C. Burges, et al. From RankNet to LambdaRank to LambdaMART: An Overview, 2010.

[22] J. Suykens, et al. Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research, 2015, Eur. J. Oper. Res.