Comparison of Information Theoretical Measures for Reduct Finding

The paper discusses the properties of an attribute selection criterion for building rough set reducts based on the discernibility matrix and compares it with the Shannon entropy and Gini index criteria used for building decision trees. It is shown, both theoretically and experimentally, that entropy and the Gini index tend to work better when the reduct is later used for prediction on previously unseen cases, while the discernibility-matrix-based criterion tends to work better for learning functional relationships, where generalization is not an issue.
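To make the compared criteria concrete, here is a minimal sketch (not the paper's implementation) of the three attribute quality measures on a single candidate attribute: Shannon entropy and Gini index of the class distribution, and a discernibility-based score counting object pairs with different decisions that the attribute separates. Function names and the pairwise-count formulation are illustrative assumptions.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a class-label sequence (bits)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index of a class-label sequence."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def discernibility_count(column, labels):
    """Illustrative discernibility-style score: the number of object
    pairs with different decision labels that the attribute discerns,
    i.e. pairs it would cover in the discernibility matrix."""
    n = len(labels)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if labels[i] != labels[j] and column[i] != column[j]
    )

# Example: an attribute that perfectly separates the two classes.
decisions = ["a", "a", "b", "b"]
attr = [0, 0, 1, 1]
print(entropy(decisions))                    # class distribution entropy
print(gini(decisions))                       # class distribution Gini
print(discernibility_count(attr, decisions)) # pairs discerned
```

A greedy reduct or tree-building heuristic would evaluate such a score for every remaining attribute and pick the best one at each step; the paper's comparison concerns which of these scores generalizes better.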
