Tolerance-based and Fuzzy-Rough Feature Selection

One of the main obstacles facing the application of computational intelligence technologies in pattern recognition (and indeed in many other tasks) is that of dataset dimensionality. To enable pattern classifiers to be effective, a dimensionality reduction step is usually carried out beforehand. Rough set theory has been successfully applied for this as it requires only the supplied data and no other information; most other methods require supplementary knowledge. However, the main limitation of traditional rough set-based selection in the literature is the restrictive requirement that all data be discrete; it is not possible to consider real-valued or noisy data. This has been tackled previously via the use of discretization methods, but these may result in information loss. This paper investigates two approaches based on rough set extensions, namely fuzzy-rough and tolerance rough sets, that address these problems and retain dataset semantics. The methods are compared experimentally and utilized for the task of forensic glass fragment identification.
