Rough set model based feature selection for mixed-type data with feature space decomposition

Abstract Feature selection plays an important role in the classification problems associated with expert and intelligent systems. The central idea behind feature selection is to identify the important input features so as to reduce the dimensionality of the input space while maintaining or improving classification performance. Traditional feature selection approaches were designed to handle either categorical or numerical features, but not the mix of both that often arises in real datasets. In this paper, we propose a novel rough set model based feature selection algorithm for classifying mixed-type data, called feature selection for mixed-type data with feature space decomposition (FSMSD). FSMSD handles both categorical and numerical features by combining rough set theory with a heterogeneous Euclidean-overlap metric. It also uses feature space decomposition to preserve the properties of multi-valued categorical features, thereby reducing information loss and retaining the features’ physical meaning. The proposed algorithm was compared with four benchmark methods on real mixed-type and biomedical datasets, and its promising performance indicates that it will be helpful to users of expert and intelligent systems.
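The heterogeneous Euclidean-overlap metric (HEOM) mentioned above is a standard distance for mixed-type data, due to Wilson and Martinez: numeric features contribute a range-normalized absolute difference, categorical features contribute an overlap (match/mismatch) distance, and missing values are assigned the maximal per-feature distance. The sketch below illustrates that standard definition only; it is not the FSMSD algorithm itself, and the function and parameter names are illustrative.

```python
import math

def heom(x, y, is_nominal, ranges):
    """Heterogeneous Euclidean-Overlap Metric (standard definition).

    x, y       : feature vectors (lists); None marks a missing value
    is_nominal : list of booleans, True where the feature is categorical
    ranges     : list of (max - min) per numeric feature (None for nominal)
    """
    total = 0.0
    for a in range(len(x)):
        if x[a] is None or y[a] is None:
            d = 1.0                               # missing value: maximal distance
        elif is_nominal[a]:
            d = 0.0 if x[a] == y[a] else 1.0      # overlap distance for categorical
        else:
            d = abs(x[a] - y[a]) / ranges[a]      # range-normalized numeric difference
        total += d * d
    return math.sqrt(total)
```

For example, two samples that differ by half the numeric range but share the same categorical value are at HEOM distance 0.5, while a categorical mismatch alone contributes a full unit of distance.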
