A relative decision entropy-based feature selection approach

Rough set theory has been proven to be an effective tool for feature selection. To avoid the exponential computation in exhaustive methods, many heuristic feature selection algorithms have been proposed in rough sets. However, these algorithms still suffer from high computational cost. In this paper, we propose a novel heuristic feature selection algorithm (called FSMRDE) in rough sets. To measure the significance of features in FSMRDE, we propose a new model of relative decision entropy, which is an extension of Shannon's information entropy in rough sets. Moreover, to test the effectiveness of FSMRDE, we apply it to intrusion detection and other application domains. Experimental results show that by using the relative decision entropy-based feature significance as heuristic information, FSMRDE is efficient for feature selection. In particular, FSMRDE is able to achieve good scalability for large data sets.

Highlights
We proposed a novel heuristic feature selection algorithm in rough sets.
We presented a new information entropy model, relative decision entropy.
We proved that relative decision entropy is monotonic with respect to the partial order of partitions.
We applied our feature selection algorithm to intrusion detection.
The effectiveness of our algorithm was shown on the KDD-99 data set and some other data sets.
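To make the idea of entropy-guided heuristic feature selection concrete, the following is a minimal sketch and not the FSMRDE algorithm itself. The abstract does not give the definition of relative decision entropy, so the sketch substitutes ordinary Shannon conditional entropy H(D | B) over the partition induced by a feature subset B as the significance measure. The function names (`partition`, `conditional_entropy`, `greedy_select`), the tolerance `tol`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def partition(rows, attrs):
    """Group row indices into blocks of rows that agree on every attribute in attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def conditional_entropy(rows, labels, attrs):
    """Shannon conditional entropy H(D | attrs) in bits (a stand-in measure,
    not the paper's relative decision entropy)."""
    n = len(rows)
    h = 0.0
    for block in partition(rows, attrs):
        counts = Counter(labels[i] for i in block)
        for c in counts.values():
            h -= (c / n) * log2(c / len(block))
    return h


def greedy_select(rows, labels, tol=1e-12):
    """Forward selection: repeatedly add the feature that lowers H(D | selected)
    the most, until it matches the entropy obtained with all features."""
    all_attrs = list(range(len(rows[0])))
    target = conditional_entropy(rows, labels, all_attrs)
    selected = []
    remaining = list(all_attrs)
    while remaining and conditional_entropy(rows, labels, selected) - target > tol:
        best = min(
            remaining,
            key=lambda a: conditional_entropy(rows, labels, selected + [a]),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy decision table: the label is determined by feature 1 alone.
rows = [(0, 0, 5), (0, 1, 5), (1, 0, 5), (1, 1, 5)]
labels = ["n", "y", "n", "y"]
print(greedy_select(rows, labels))
```

Each step re-partitions the data for every candidate feature, which is the kind of repeated cost that heuristic rough-set methods try to keep low. A faster significance measure would replace `conditional_entropy` while leaving the greedy loop unchanged.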
