Weighted SMOTE-Ensemble Algorithms: Evidence from Chinese Imbalance Credit Approval Instances

This study proposes a novel ensemble approach, called WSMOTE-ensemble, that builds on the weighted synthetic minority over-sampling technique (WSMOTE) for modeling skewed loan performance data. The proposed ensemble classifier combines WSMOTE and Bagging with sampling composite mixtures (SCMs) to mitigate the class imbalance between positive and negative small business instances. Applying different sampling composite mixtures produces diverse training sets, which in turn increases the diversity of the trained algorithms. Based on the fitted evaluation measures, the study recommends the 'WSMOTE-ensemble k-NN' methodology, derived from WSMOTE-decision tree-bagging with k nearest neighbor, as the best fusion sampling strategy, which is a novel finding in this domain.
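The general idea of pairing weighted over-sampling with bagging over several sampling rates can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify the WSMOTE weighting or the exact composition of the sampling mixtures. The sketch assumes that (a) a minority point's weight is its mean distance to its k nearest minority neighbours, (b) each bag uses a different target minority-to-majority ratio (the hypothetical `rates` parameter), and (c) the base learner is a plain k-NN classifier. All function names and the toy data are hypothetical.

```python
import math
import random


def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_smote(minority, n_new, k, rng):
    """Create n_new synthetic minority points (needs at least 2 minority points).

    Assumption: sparser points (larger mean distance to their k nearest
    minority neighbours) are sampled more often as interpolation seeds.
    """
    neighbours, weights = [], []
    for i, p in enumerate(minority):
        nearest = sorted((euclid(p, q), j) for j, q in enumerate(minority) if j != i)[:k]
        neighbours.append([j for _, j in nearest])
        weights.append(sum(d for d, _ in nearest) / len(nearest) + 1e-12)
    synthetic = []
    for _ in range(n_new):
        i = rng.choices(range(len(minority)), weights=weights)[0]
        j = rng.choice(neighbours[i])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(minority[i], minority[j])])
    return synthetic


def fit_ensemble(X, y, rates=(0.5, 0.75, 1.0), k=3, seed=0):
    """Build one bootstrap training set per over-sampling rate (label 1 = minority)."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == 1]
    majority = [x for x, label in zip(X, y) if label == 0]
    members = []
    for rate in rates:
        boot_min = [rng.choice(minority) for _ in minority]  # bootstrap resample
        boot_maj = [rng.choice(majority) for _ in majority]
        n_new = max(0, int(rate * len(boot_maj)) - len(boot_min))
        synth = weighted_smote(boot_min, n_new, k, rng)
        train_X = boot_maj + boot_min + synth
        train_y = [0] * len(boot_maj) + [1] * (len(boot_min) + len(synth))
        members.append((train_X, train_y))
    return members


def knn_predict(train_X, train_y, x, k):
    nearest = sorted((euclid(x, t), label) for t, label in zip(train_X, train_y))[:k]
    return 1 if 2 * sum(label for _, label in nearest) > len(nearest) else 0


def predict(members, x, k=3):
    """Majority vote of the k-NN base learners, one per training set."""
    votes = sum(knn_predict(tx, ty, x, k) for tx, ty in members)
    return 1 if 2 * votes > len(members) else 0


# Toy imbalanced data (hypothetical): 20 majority points, 4 minority points.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(20)]
X += [[data_rng.uniform(4, 6), data_rng.uniform(4, 6)] for _ in range(4)]
y = [0] * 20 + [1] * 4
ensemble = fit_ensemble(X, y)
```

Varying `rates` across bags is what makes the training sets differ beyond ordinary bootstrap noise; a decision tree or any other base learner could replace `knn_predict` without changing the sampling logic.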
