A Novel GSCI-Based Ensemble Approach for Credit Scoring

Credit scoring is an effective tool for financial institutions to manage credit risk. In recent years, many novel machine learning models have been developed for credit scoring. Among these, heterogeneous ensemble models have received much attention because of their superior performance. This paper presents a new heterogeneous ensemble model based on the generalized Shapley value and the Choquet integral. The model first uses a fuzzy measure to express the interactive characteristics between any two coalitions of base learners. Based on an objective function that combines accuracy and diversity, a linear programming model is built to determine the fuzzy measure. To retain as much of the original information as possible in the training stage, normal fuzzy numbers are employed to express the predicted values of the base learners. Then, the generalized Shapley Choquet integral (GSCI) aggregation operator is defined to calculate the comprehensive predicted value of the ensemble model. Based on the defined aggregation operator and the linear programming model, a GSCI approach is proposed for ensemble credit scoring. To illustrate the efficiency and feasibility of the GSCI approach, experiments with thirteen machine learning models are conducted on four public credit scoring datasets and three real-world P2P lending datasets with large volumes of samples. Furthermore, robustness tests and comparative analyses are carried out to demonstrate the adaptability and performance of the GSCI-based ensemble model.
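
The aggregation step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes crisp scores rather than normal fuzzy numbers, a hand-picked fuzzy measure rather than one obtained from the linear programming model, and the standard generalized Shapley index from the literature, which may differ in detail from the operator the paper defines. The learner names ("lr", "svm", "rf"), the measure values in MU, and the function names are hypothetical.

```python
from itertools import combinations
from math import factorial

# Hypothetical base learners and a hand-picked monotone fuzzy measure
# (assumption: in the paper the measure comes from a linear program).
PLAYERS = ("lr", "svm", "rf")
MU = {
    frozenset(): 0.0,
    frozenset({"lr"}): 0.3,
    frozenset({"svm"}): 0.4,
    frozenset({"rf"}): 0.5,
    frozenset({"lr", "svm"}): 0.6,
    frozenset({"lr", "rf"}): 0.7,
    frozenset({"svm", "rf"}): 0.8,
    frozenset(PLAYERS): 1.0,
}


def measure(coalition):
    """Look up the fuzzy measure of a coalition of base learners."""
    return MU[frozenset(coalition)]


def generalized_shapley(coalition, players, mu):
    """Generalized Shapley index of a coalition S:
    sum over T in N\\S of (n-s-t)! t! / (n-s+1)! * (mu(S u T) - mu(T))."""
    S = frozenset(coalition)
    rest = [p for p in players if p not in S]
    n, s = len(players), len(S)
    total = 0.0
    for t in range(len(rest) + 1):
        weight = factorial(n - s - t) * factorial(t) / factorial(n - s + 1)
        for T in combinations(rest, t):
            T = frozenset(T)
            total += weight * (mu(S | T) - mu(T))
    return total


def choquet(values, set_function):
    """Discrete Choquet integral of crisp scores w.r.t. a set function."""
    order = sorted(values, key=values.get)  # learners by ascending score
    total, previous = 0.0, 0.0
    for i, name in enumerate(order):
        upper = frozenset(order[i:])  # learners scoring at least values[name]
        total += (values[name] - previous) * set_function(upper)
        previous = values[name]
    return total


def gsci(values, players, mu):
    """Choquet integral taken w.r.t. the generalized Shapley index."""
    return choquet(values, lambda A: generalized_shapley(A, players, mu))


# Hypothetical predicted default probabilities for one applicant.
scores = {"lr": 0.6, "svm": 0.7, "rf": 0.9}
plain = choquet(scores, measure)
shapley_based = gsci(scores, PLAYERS, measure)
print(plain, shapley_based)
```

In this sketch, replacing the raw measure with the generalized Shapley index lets each coalition's weight reflect its average marginal contribution across all other coalitions, rather than only its own measure value.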
