A New Credit Scoring Method Based on Rough Sets and Decision Tree

Credit scoring is a very typical classification problem in Data Mining. Many classification methods have been presented in the literatures to tackle this problem. The decision tree method is a particularly effective method to build a classifier from the sample data. Decision tree classification method has higher prediction accuracy for the problems of classification, and can automatically generate classification rules. However, the original sample data sets used to generate the decision tree classification model often contain many noise or redundant data. These data will have a great impact on the prediction accuracy of the classifier. Therefore, it is necessary and very important to preprocess the original sample data. On this issue, a very effective approach is the rough sets. In rough sets theory, a basic problem that can be tackled using rough sets approach is reduction of redundant attributes. This paper presents a new credit scoring approach based on combination of rough sets theory and decision tree theory. The results of this study indicate that the process of reduction of attribute is very effective and our approach has good performance in terms of prediction accuracy.

[1]  D. Hand,et al.  A k-nearest-neighbour classifier for assessing consumer credit risk , 1996 .

[2]  Catherine Blake,et al.  UCI Repository of machine learning databases , 1998 .

[3]  Andrzej Skowron,et al.  The Discernibility Matrices and Functions in Information Systems , 1992, Intelligent Decision Support.

[4]  Mehmed Kantardzic,et al.  Data Mining: Concepts, Models, Methods, and Algorithms , 2002 .

[5]  R. Słowiński Intelligent Decision Support: Handbook of Applications and Advances of the Rough Sets Theory , 1992 .

[6]  Lior Rokach,et al.  An Introduction to Decision Trees , 2007 .

[7]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[8]  A. G. Hamilton Logic for Mathematicians , 1978 .

[9]  Qinghua Hu,et al.  Consistency Based Attribute Reduction , 2007, PAKDD.

[10]  Aiko M. Hormann,et al.  Programs for Machine Learning. Part I , 1962, Inf. Control..

[11]  J. Ross Quinlan,et al.  Induction of Decision Trees , 1986, Machine Learning.

[12]  Wang Yuan A Fast AIgorithm for Reduction Based on Skowron Discernibility Matrix , 2005 .

[13]  Mu-Chen Chen,et al.  Credit scoring with a data mining approach based on support vector machines , 2007, Expert Syst. Appl..

[14]  Vijay S. Desai,et al.  A comparison of neural networks and linear scoring models in the credit union environment , 1996 .

[15]  John R. Koza,et al.  Genetic programming - on the programming of computers by means of natural selection , 1993, Complex adaptive systems.

[16]  William Edward Henley,et al.  Statistical aspects of credit scoring , 1995 .

[17]  J. Barkley Rosser Logic for mathematicians / J. Barkley Rosser , 1953 .

[18]  D. J. Newman,et al.  UCI Repository of Machine Learning Database , 1998 .