Credit Scoring Based on Hybrid Data Mining Classification

Credit scoring is widely regarded as a critical topic. This study proposes four hybrid approaches, each combining a neural network (NN) classifier with a feature selection method that retains sufficient information for classification. The NN classifier is combined with conventional statistical linear discriminant analysis (LDA), decision tree, rough set, and F-score approaches, which serve as a feature preprocessing step that optimizes the feature space by removing both irrelevant and redundant features. The models are built and evaluated on two UCI data sets. The procedure of the proposed algorithm is described first, and the approaches are then evaluated by their performance. The results of the approaches, each in combination with the NN classifier, are compared, and the nonparametric Wilcoxon signed-rank test is used to determine whether there are significant differences among them. Our results suggest that hybrid credit scoring models are robust and effective in finding optimal feature subsets, and that the compound procedure is a promising method for data mining.
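The F-score preprocessing step named above can be illustrated with a minimal sketch. The abstract does not give the formula, so this assumes one commonly used definition: for each feature, the squared distances of the two class means from the overall mean, divided by the sum of the two within-class sample variances. The function names `f_scores` and `select_top_k`, the toy data, and the choice of `k` are hypothetical and not taken from the study. The reduced feature matrix would then be passed to the NN classifier.

```python
from statistics import mean, variance


def f_scores(X, y):
    """Return one F-score per column of X for binary labels y (1 or 0).

    Assumed definition (not stated in the abstract): between-class
    separation of the means divided by the summed within-class sample
    variances. Each class needs at least two samples.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        col_all = [row[j] for row in X]
        col_pos = [row[j] for row in pos]
        col_neg = [row[j] for row in neg]
        overall = mean(col_all)
        between = (mean(col_pos) - overall) ** 2 + (mean(col_neg) - overall) ** 2
        within = variance(col_pos) + variance(col_neg)
        if within == 0:
            # Constant within both classes: perfectly separating or useless.
            scores.append(float("inf") if between > 0 else 0.0)
        else:
            scores.append(between / within)
    return scores


def select_top_k(X, y, k):
    """Keep the k highest-scoring features; return their indices and the reduced data."""
    scores = f_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    reduced = [[row[j] for j in keep] for row in X]
    return keep, reduced


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0],
     [3.0, 4.0], [3.2, 5.0], [2.8, 3.0]]
y = [0, 0, 0, 1, 1, 1]

scores = f_scores(X, y)
keep, reduced = select_top_k(X, y, k=1)
print(scores, keep)
```

A filter of this kind scores each feature independently, so it can discard irrelevant features but cannot by itself detect redundancy between two informative ones.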
