Rough modeling - a bottom-up approach to model construction

Traditional data mining methods based on rough set theory focus on extracting models that classify unseen objects well. If the goal is to uncover new knowledge from the data, however, the model must also have high descriptive quality: it must describe the data set clearly and concisely without sacrificing classification performance. Rough modeling, introduced by Kowalczyk (1998), is an approach that aims to provide models with good predictive and descriptive qualities while remaining computationally simple enough to handle large data sets. Because rough models are flexible and simple to generate, it is feasible to generate a large number of them and search for the best one. Initial experiments confirm that rough models perform at worst slightly below models induced with traditional rough set methods, while the gain in descriptive quality is very large.
