A Metaheuristic Strategy for Feature Selection Problems: Application to Credit Risk Evaluation in Emerging Markets

As countries develop digital financial infrastructure, a wide range of economic activities expand and grow in importance, from personal loans and the rapidly developing networked microfinance industry to mobile telephone services and real estate transactions. Personal credit is also a foundation of trust that facilitates integrated societal transactions more generally. In emerging markets, however, there is a gap between the need to establish a credit or trust rating and the lack of a credit record. Methodologies that support greater financial integration of growing economies could significantly increase the GDP of developing economies (by 4-12%, according to a recent McKinsey Global Institute report). In this paper, we develop a methodology for feature selection and test it on standard datasets from large institutions in mature market economies and on a recent dataset that illustrates characteristics of emerging markets. The results show that, across a range of machine learning techniques, classification performance can be maintained while runtime is reduced when a genetic algorithm (GA) is used for feature selection.
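
To make the idea of GA-based feature selection concrete, the following is a minimal sketch of a wrapper-style search, not the procedure used in the paper. The abstract does not specify the encoding, the genetic operators, the classifier, or the fitness function, so everything here is an assumption made for illustration: a binary mask per candidate subset, tournament selection, one-point crossover, bit-flip mutation, elitism, a nearest-centroid classifier, a small per-feature penalty, and synthetic data. The names `make_data`, `centroid_accuracy`, and `ga_select` are hypothetical.

```python
import random

def make_data(n=200, n_informative=3, n_noise=7, seed=0):
    """Synthetic two-class data (assumption): informative columns first, then noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y

def centroid_accuracy(X_tr, y_tr, X_te, y_te, mask):
    """Accuracy of a nearest-centroid classifier restricted to the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_te, y_te):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        correct += (min(dist, key=dist.get) == t)
    return correct / len(y_te)

def ga_select(X_tr, y_tr, X_va, y_va, pop_size=20, generations=15,
              p_mut=0.1, penalty=0.01, seed=1):
    """Search binary feature masks with a simple generational GA."""
    rng = random.Random(seed)
    n = len(X_tr[0])

    def fitness(mask):
        # Validation accuracy minus a small cost per selected feature (assumption).
        return centroid_accuracy(X_tr, y_tr, X_va, y_va, mask) - penalty * sum(mask)

    def pick(pop):
        # Binary tournament: the fitter of two randomly drawn individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            p1, p2 = pick(pop), pick(pop)
            cut = rng.randint(1, n - 1)                  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]  # mutation
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best, fitness(best)

X, y = make_data()
mask, score = ga_select(X[:150], y[:150], X[150:], y[150:])
print("selected feature indices:", [j for j, g in enumerate(mask) if g])
print("penalised validation score:", round(score, 3))
```

The per-feature penalty is one simple way to favour smaller subsets, which is where a runtime saving for the downstream classifier would come from. In practice the fitness would more likely use cross-validated performance of the actual model being tuned.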
