An Approximation Decision Entropy Based Decision Tree Algorithm and Its Application in Intrusion Detection

In this paper, we propose a novel decision tree algorithm, DTADE, within the framework of rough set theory and apply it to intrusion detection. We define a new information entropy model in rough sets, approximation decision entropy (ADE), which combines the concept of conditional entropy from Shannon's information theory with the concept of approximation accuracy from rough sets. In DTADE, ADE serves as the heuristic for selecting splitting attributes. Moreover, we present a decision tree pre-pruning method based on the concept of knowledge entropy proposed by Düntsch and Gediga. Finally, we use the KDDCUP99 data set to verify the effectiveness of our algorithm in intrusion detection.
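
The abstract does not state the ADE formula, so the following is only a minimal sketch of the two standard quantities it is said to combine: the conditional entropy H(D|B) of the decision attribute given a set of condition attributes B, and the rough-set approximation accuracy of a target set (size of the lower approximation divided by size of the upper approximation). The toy decision table, its attribute names (`protocol`, `flag`, `label`), and the function names are hypothetical and chosen for illustration. How DTADE merges the two quantities into ADE is not shown here.

```python
from collections import defaultdict
from math import log2

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())

def conditional_entropy(rows, cond_attrs, dec_attr):
    """Shannon conditional entropy H(dec_attr | cond_attrs), in bits."""
    n = len(rows)
    h = 0.0
    for block in partition(rows, cond_attrs):
        counts = defaultdict(int)
        for i in block:
            counts[rows[i][dec_attr]] += 1
        for c in counts.values():
            p = c / len(block)
            h -= (len(block) / n) * p * log2(p)
    return h

def approximation_accuracy(rows, cond_attrs, target):
    """Rough-set accuracy |lower(target)| / |upper(target)| under cond_attrs."""
    lower, upper = set(), set()
    for block in partition(rows, cond_attrs):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return len(lower) / len(upper) if upper else 1.0

# Hypothetical toy decision table (not taken from KDDCUP99).
rows = [
    {"protocol": "tcp",  "flag": "SF", "label": "normal"},
    {"protocol": "tcp",  "flag": "SF", "label": "normal"},
    {"protocol": "tcp",  "flag": "S0", "label": "attack"},
    {"protocol": "udp",  "flag": "SF", "label": "normal"},
    {"protocol": "udp",  "flag": "SF", "label": "attack"},
    {"protocol": "icmp", "flag": "SF", "label": "attack"},
]
attack = {i for i, r in enumerate(rows) if r["label"] == "attack"}

for attrs in (["protocol"], ["flag"], ["protocol", "flag"]):
    print(attrs,
          round(conditional_entropy(rows, attrs, "label"), 4),
          round(approximation_accuracy(rows, attrs, attack), 4))
```

On this toy table, a lower conditional entropy and a higher approximation accuracy both indicate that the chosen attributes separate the decision classes more cleanly, which is the kind of signal a splitting heuristic can rank attributes by.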
