Feature selection in mixed data: A method using a novel fuzzy rough set-based information entropy

Feature selection in data with different types of feature values, i.e., heterogeneous or mixed data, is of particular practical importance because such data sets are widespread in the real world. The key issue for feature selection in mixed data is how to properly handle the different types of features (attributes) in the data set. Fuzzy rough set theory allows a different fuzzy relation to be defined for each type of attribute to measure the similarity between objects, and entropy is an effective measure of information uncertainty. Motivated by both, we propose in this paper a fuzzy rough set-based information entropy for feature selection in mixed data sets. We prove that the newly defined entropy satisfies the common requirement of monotonicity and equivalently characterizes the existing attribute reductions in fuzzy rough set theory. A feature selection algorithm is then formulated based on the proposed entropy, and a filter-wrapper method is suggested to select the best feature subset in terms of classification accuracy. An extensive numerical experiment is further conducted to assess the performance of the feature selection method, and the results are satisfactory.

Highlights
- A novel fuzzy rough set-based information entropy is constructed for mixed data.
- The proposed entropy can equivalently characterize the existing attribute reductions in fuzzy rough set theory.
- A feature selection algorithm is formulated based on the proposed entropy.
- A filter-wrapper method is suggested to select the best feature subset.
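The general idea of entropy-guided feature selection with per-type fuzzy relations can be illustrated with a minimal sketch. The abstract does not state the definition of the proposed entropy, so the code below substitutes a commonly used fuzzy conditional entropy, H(D|B) = -(1/n) Σ log2(|[x_i]_B ∩ [x_i]_D| / |[x_i]_B|), as a stand-in; unlike the entropy proposed in the paper, this stand-in is not guaranteed to be monotone. The similarity functions, the minimum t-norm used to combine attributes, the stopping tolerance, and all function names are assumptions made for illustration, and the wrapper step (choosing among candidate subsets by classification accuracy) is omitted.

```python
import numpy as np

def fuzzy_relation(column, is_nominal):
    """Similarity matrix with values in [0, 1] for a single attribute."""
    col = np.asarray(column)
    if is_nominal:
        # Assumed crisp relation: 1 if two objects share a category, else 0.
        return (col[:, None] == col[None, :]).astype(float)
    col = col.astype(float)
    span = col.max() - col.min()
    if span == 0:
        return np.ones((len(col), len(col)))
    # Assumed numeric relation: 1 minus the range-normalized distance.
    return 1.0 - np.abs(col[:, None] - col[None, :]) / span

def subset_relation(relations, subset):
    """Combine per-attribute relations with the minimum t-norm."""
    n = relations[0].shape[0]
    combined = np.ones((n, n))
    for k in subset:
        combined = np.minimum(combined, relations[k])
    return combined

def conditional_entropy(rel_b, rel_d):
    """Stand-in entropy H(D|B); NOT the entropy defined in the paper."""
    joint = np.minimum(rel_b, rel_d).sum(axis=1)
    # Both relations are reflexive, so every ratio lies in (0, 1].
    return float(-np.mean(np.log2(joint / rel_b.sum(axis=1))))

def greedy_select(columns, nominal_flags, labels, tol=1e-9):
    """Forward selection: add the attribute that lowers H(D|B) the most."""
    relations = [fuzzy_relation(c, f) for c, f in zip(columns, nominal_flags)]
    rel_d = fuzzy_relation(labels, True)
    all_attrs = range(len(columns))
    target = conditional_entropy(subset_relation(relations, all_attrs), rel_d)
    selected, remaining = [], list(all_attrs)
    current = conditional_entropy(subset_relation(relations, []), rel_d)
    # Stop once the subset is as informative as the full attribute set.
    while remaining and current > target + tol:
        scores = [
            conditional_entropy(subset_relation(relations, selected + [k]), rel_d)
            for k in remaining
        ]
        best = int(np.argmin(scores))
        current = scores[best]
        selected.append(remaining.pop(best))
    return selected

# Toy mixed data: attribute 0 is nominal, attribute 1 is numeric.
toy_columns = [["a", "a", "b", "b"], [0.1, 0.9, 0.2, 0.8]]
toy_labels = [0, 0, 1, 1]
print(greedy_select(toy_columns, [True, False], toy_labels))
```

In this toy data set the nominal attribute alone separates the two classes, so the loop stops after selecting it. A filter-wrapper variant in the spirit of the abstract would evaluate the intermediate subsets produced by this loop with a classifier and keep the most accurate one.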
