Fuzzy-Rough Set Bireducts for Data Reduction

Data reduction is an important step that helps ease the computational intractability that learning techniques face when data are large. This is particularly true for the huge datasets that have become commonplace in recent times. The main problem facing both data preprocessors and learning techniques is that data are expanding both in dimensionality and in the number of data instances. Approaches based on fuzzy-rough sets offer many advantages for both feature selection and classification, particularly for real-valued and noisy data; however, the majority of recent approaches address data reduction in terms of either dimensionality or training data size in isolation. This paper demonstrates how the notion of fuzzy-rough bireducts can be used for the simultaneous reduction of data size and dimensionality. It also shows how bireducts, and therefore reduced subtables of data, can be used not only as a preprocessing tool but also for learning compact and robust classifiers. Furthermore, the ideas can be extended to the unsupervised domain when dealing with unlabeled data. Experimental evaluation of various techniques demonstrates that high levels of simultaneous reduction of both dimensionality and data size can be achieved whilst maintaining robust performance.
