A Novel Approach of Rough Set-Based Attribute Reduction Using Fuzzy Discernibility Matrix

Rough set theory provides an effective approach to attribute reduction (also called feature selection) that preserves the meaning of the attributes (features). However, most existing algorithms target information systems or decision tables with discrete values. In this paper, we introduce a novel rough set-based attribute reduction method that builds a fuzzy discernibility matrix using a distance-preserving strategy. Because the proposed method is independent of the post-analysis algorithm (predictor), we use only Fisher discriminant analysis with kernels as the discriminant criterion for testing the effectiveness of the selected attribute subsets with relatively high fitness values. Experimental results show that, on all eight UCI benchmark datasets, classifiers built on the selected attribute subsets perform better than or comparably to those built on all attributes. Thus, the proposed method can, in most cases, find effective attribute subsets. In addition, it can be directly incorporated into other learning algorithms, such as PCA and SVM, and can be readily applied to real applications such as web categorization and image recognition.
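
For orientation, the classical (crisp) discernibility matrix that the fuzzy variant generalizes can be sketched as follows. This is a minimal illustration for a decision table with discrete values only; it is not the fuzzy, distance-preserving construction proposed here, whose definition the abstract does not give. The function names and the toy table are assumptions made for this example.

```python
from itertools import combinations


def discernibility_matrix(rows, decisions):
    """Classical discernibility matrix for a discrete decision table.

    For every pair of objects with different decision values, store the set
    of condition-attribute indices on which the two objects differ.
    """
    matrix = {}
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] != decisions[j]:
            matrix[(i, j)] = {
                a for a, (u, v) in enumerate(zip(rows[i], rows[j])) if u != v
            }
    return matrix


def core_attributes(matrix):
    """Attributes that occur as singleton entries form the core."""
    return {next(iter(entry)) for entry in matrix.values() if len(entry) == 1}


# Hypothetical toy decision table: 4 objects, 3 discrete condition attributes.
rows = [(0, 1, 0), (0, 0, 0), (1, 1, 0), (1, 0, 1)]
decisions = ["yes", "no", "yes", "no"]

matrix = discernibility_matrix(rows, decisions)
core = core_attributes(matrix)
```

In this toy table, objects 0 and 1 differ only on attribute 1, so that attribute cannot be dropped without merging objects from different classes. The crisp test `u != v` is what breaks down for continuous values, which is the limitation the abstract points to.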
