Mixed feature selection in incomplete decision table

Abstract Feature selection in incomplete decision tables has gained considerable attention recently. However, most existing feature selection methods are designed for incomplete data with only categorical features. In this paper, we introduce an extended rough set model based on the neighborhood-tolerance relation, which is applicable to incomplete data with mixed categorical and numerical features. From this model we derive neighborhood-tolerance conditional entropy, an uncertainty measure that can be used to evaluate feature subsets. Dependency is a well-known feature evaluation measure in rough set theory; a comparison and analysis of classification complexity between the two measures indicates that neighborhood-tolerance conditional entropy is a more effective feature evaluation criterion than dependency for incomplete decision tables. We then construct a heuristic feature selection algorithm based on neighborhood-tolerance conditional entropy. Experimental results show that the proposed method is applicable and effective for incomplete mixed data.
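The abstract does not state the formulas, so the following Python is an illustrative sketch only, not the paper's definitions. It shows one plausible reading of the idea: a neighborhood-tolerance relation (missing values tolerate anything, categorical values must match, numerical values must lie within a radius), a commonly used neighborhood-entropy form H(D|B) = -(1/|U|) Σ_x log(|N_B(x) ∩ [x]_D| / |N_B(x)|), and greedy forward selection driven by that entropy. All names, the radius `DELTA`, and the particular entropy form are assumptions introduced here for illustration.

```python
import math

MISSING = None   # placeholder for a missing value (assumption of this sketch)
DELTA = 0.2      # assumed neighborhood radius for (normalized) numerical features

def nt_related(x, y, features, kinds, delta=DELTA):
    """Neighborhood-tolerance relation on feature set `features`: two objects
    are related if every feature is missing in either object, equal
    (categorical), or within `delta` (numerical)."""
    for i in features:
        a, b = x[i], y[i]
        if a is MISSING or b is MISSING:
            continue  # a missing value tolerates any value
        if kinds[i] == 'cat':
            if a != b:
                return False
        elif abs(a - b) > delta:
            return False
    return True

def nt_conditional_entropy(U, labels, features, kinds, delta=DELTA):
    """H(D|B) = -(1/|U|) * sum_x log(|N_B(x) ∩ [x]_D| / |N_B(x)|).
    The relation is reflexive, so both counts are at least 1 (no log(0))."""
    total = 0.0
    for i, x in enumerate(U):
        nbr = [j for j, y in enumerate(U) if nt_related(x, y, features, kinds, delta)]
        same = [j for j in nbr if labels[j] == labels[i]]
        total += math.log(len(same) / len(nbr))
    return -total / len(U)

def greedy_select(U, labels, kinds, delta=DELTA, eps=1e-6):
    """Heuristic forward selection: repeatedly add the feature that most
    reduces the conditional entropy; stop when no feature helps."""
    remaining = list(range(len(kinds)))
    selected = []
    best = nt_conditional_entropy(U, labels, selected, kinds, delta)
    while remaining:
        h, f = min((nt_conditional_entropy(U, labels, selected + [f], kinds, delta), f)
                   for f in remaining)
        if best - h < eps:
            break
        selected.append(f)
        remaining.remove(f)
        best = h
    return selected
```

On a toy table with one numerical and one categorical feature, a categorical feature that is consistent with the decision drives the entropy to zero by itself, so the greedy search selects it alone even when the numerical column contains a missing value.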
