Hybrid Harmony Search–Artificial Intelligence Models in Credit Scoring

Credit scoring is an important tool that financial institutions use to distinguish defaulters from non-defaulters. Support Vector Machines (SVM) and Random Forest (RF) are Artificial Intelligence techniques that have attracted interest because of their flexibility in accounting for various data patterns. Both are black-box models that are sensitive to hyperparameter settings. Feature selection can be performed on SVM so that the model can be explained through the reduced feature set, whereas the feature importance computed by RF can be used for model explanation. The combined benefits of accuracy and interpretability allow for significant improvement in credit risk and credit scoring. This paper proposes the use of Harmony Search (HS) to form a hybrid HS-SVM that performs feature selection and hyperparameter tuning simultaneously, and a hybrid HS-RF that tunes the hyperparameters. A Modified HS (MHS) is also proposed, with the main objective of achieving results comparable to those of the standard HS in a shorter computational time. MHS makes four main modifications to the standard HS: (i) elitism selection during memory consideration instead of random selection, (ii) dynamic exploration and exploitation operators in place of the original static operators, (iii) a self-adjusted bandwidth operator, and (iv) additional termination criteria to reach faster convergence. Together with parallel computing, MHS effectively reduces the computational time of the proposed hybrid models. The proposed hybrid models are compared with standard statistical models across three datasets commonly used in credit scoring studies. The computational results show that MHS-RF is the most robust in terms of model performance, model explainability and computational time.
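To make the four modifications more concrete, the following is a minimal sketch of a Harmony Search loop in plain Python. It is not the paper's implementation. The abstract does not specify how each modification is realised, so the concrete rules here are assumptions chosen for illustration: drawing from the better half of the memory for elitism, a linearly increasing pitch adjustment rate, a bandwidth equal to the spread of the memory, and a `patience` stopping rule. The names `harmony_search`, `toy_error`, `hms`, `hmcr`, `par` and `patience` are hypothetical, and `toy_error` merely stands in for a cross-validated classification error.

```python
import random


def harmony_search(objective, bounds, hms=10, hmcr=0.9, par=0.3,
                   max_iter=500, patience=100, seed=0):
    """Minimise `objective` over the box `bounds` (illustrative sketch only)."""
    rng = random.Random(seed)
    # Harmony memory: random initial solutions, kept sorted best-first.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    memory.sort(key=objective)
    stale = 0  # iterations since the memory last improved
    for it in range(max_iter):
        # Assumed dynamic operator: pitch adjustment rate rises linearly to 0.9.
        par_t = par + (0.9 - par) * it / max_iter
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Assumed elitism rule: pick from the better half of the memory.
                value = memory[rng.randrange(max(1, hms // 2))][d]
                if rng.random() < par_t:
                    # Assumed self-adjusted bandwidth: spread of this dimension.
                    column = [h[d] for h in memory]
                    value += rng.uniform(-1.0, 1.0) * (max(column) - min(column))
            else:
                # Exploration: sample the dimension uniformly at random.
                value = rng.uniform(lo, hi)
            new.append(min(max(value, lo), hi))  # clip to the bounds
        if objective(new) < objective(memory[-1]):
            memory[-1] = new  # replace the worst harmony
            memory.sort(key=objective)
            stale = 0
        else:
            stale += 1
        # Assumed extra termination criterion: stop after `patience` stale steps.
        if stale >= patience:
            break
    return memory[0], objective(memory[0])


def toy_error(x):
    # Hypothetical stand-in for a model's validation error; minimum at (3, -1).
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


best, err = harmony_search(toy_error, bounds=[(-10.0, 10.0), (-10.0, 10.0)])
print(best, err)
```

In a hybrid such as HS-RF, the solution vector would instead encode the RF hyperparameters and the objective would be a validation error. For HS-SVM the vector would additionally carry a feature-selection mask, so that features and hyperparameters are searched simultaneously.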
