Fuzzy–Rough Simultaneous Attribute Selection and Feature Extraction Algorithm

Of the large number of attributes or features present in real-life data sets, only a small fraction is needed to represent the data accurately. Selecting or extracting relevant and significant features before analysis is therefore an important preprocessing step in pattern recognition, data mining, and machine learning. This paper presents a novel dimensionality reduction method, based on fuzzy-rough sets, that simultaneously selects attributes and extracts features using the concept of feature significance. The method maximizes both the relevance and the significance of the reduced feature set, thereby removing redundancy from it. The paper also uses classical and neighborhood rough sets to compute the relevance and significance of the feature set, and compares their performance with that of fuzzy-rough sets in terms of the predictive accuracy of the nearest neighbor rule, support vector machine, and decision tree. An important finding is that the proposed method based on fuzzy-rough sets is more effective at generating a relevant and significant feature subset. The effectiveness of the proposed method, along with a comparison with existing attribute selection and feature extraction methods, is demonstrated on real-life data sets.
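
As a rough illustration of the relevance and significance criterion described above, the following minimal sketch uses classical rough sets on discrete data rather than the fuzzy-rough formulation, and covers attribute selection only, not feature extraction. It assumes that relevance is the rough-set dependency of the decision on a single attribute, and that significance is the gain in dependency when a candidate is added to an already selected attribute. The function names, the greedy scoring rule (relevance plus mean significance), and the toy data are illustrative assumptions, not the paper's algorithm.

```python
from collections import defaultdict


def dependency(rows, attrs, decision):
    """Classical rough-set dependency: |positive region| / |universe|."""
    # Group objects into equivalence classes by their values on `attrs`
    # and record which decision values occur in each class.
    classes = defaultdict(set)
    for row in rows:
        classes[tuple(row[a] for a in attrs)].add(row[decision])
    # An object is in the positive region if its class has one decision value.
    positive = sum(
        1 for row in rows
        if len(classes[tuple(row[a] for a in attrs)]) == 1
    )
    return positive / len(rows)


def significance(rows, candidate, chosen, decision):
    """Gain in dependency from adding `candidate` to the attribute `chosen`."""
    return (dependency(rows, [chosen, candidate], decision)
            - dependency(rows, [chosen], decision))


def select_attributes(rows, attrs, decision, k):
    """Greedy selection (assumed rule): relevance + mean significance."""
    remaining = list(attrs)
    selected = []
    while remaining and len(selected) < k:
        def score(a):
            relevance = dependency(rows, [a], decision)
            if not selected:
                return relevance
            mean_sig = sum(
                significance(rows, a, s, decision) for s in selected
            ) / len(selected)
            return relevance + mean_sig

        best = max(remaining, key=score)  # ties go to the earliest attribute
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: d = a XOR b, and c is a redundant copy of a.
ROWS = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 0, "c": 1, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 0},
]

if __name__ == "__main__":
    print(select_attributes(ROWS, ["a", "c", "b"], "d", 2))
```

In this toy case no single attribute determines the decision, so every relevance is zero. Once `a` is chosen, `c` adds nothing to the dependency while `b` raises it to one, so the significance term steers the selection away from the redundant attribute.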
