Unsupervised fuzzy-rough set-based dimensionality reduction

Each year, more and more data is collected worldwide. In fact, it is estimated that the amount of data collected and stored at least doubles every 2 years. A large percentage of this data is unlabelled, or has labels which are incomplete or missing. The sheer volume of this data makes it very difficult for humans to manually assign labels to data objects. Additionally, many real-world application datasets, such as those in gene expression analysis and text classification, are also of large dimensionality. This further frustrates the process of label assignment for domain experts, as not all of the features are relevant or necessary in order to assign a given label. Hence, unsupervised feature selection (FS) is required. For supervised learning, feature selection algorithms attempt to maximise a given function of predictive accuracy. This function typically considers the ability of feature vectors to reflect decision class labels. However, for the unsupervised learning task, decision class labels are not provided, which poses questions such as: which features should be retained? Not all features are important; some are irrelevant, redundant or noisy. In this paper, several unsupervised FS approaches are presented which are based on fuzzy-rough sets. These approaches require no thresholding information, are domain-independent, and can operate on real-valued data without the need for discretisation. They offer a significant reduction in dimensionality whilst retaining the semantics of the data, and can even result in feature subsets that are supersets of those found by the supervised fuzzy-rough approaches. The approaches are compared with some supervised techniques and are shown to retain useful features.
