Application of Machine Learning Algorithms to KDD Intrusion Detection Dataset within Misuse Detection Context

A small subset of machine learning algorithms, mostly based on inductive learning, has been applied to the KDD Cup 1999 intrusion detection dataset, and the recent literature reports dismal performance for the user-to-root and remote-to-local attack categories. Whether other machine learning algorithms can outperform those already employed remains an open question, and it motivates the study reported herein. Of particular interest is whether certain algorithms perform better for certain attack classes and, consequently, whether a multi-expert classifier design can deliver the desired performance. This paper evaluates the performance of a comprehensive set of pattern recognition and machine learning algorithms on the four attack categories found in the KDD Cup 1999 intrusion detection dataset. The results of the simulation study indicate that certain classification algorithms perform better for certain attack categories; that is, a specific algorithm specializes in a given attack category. Consequently, a multi-classifier model was built in which each attack category is assigned the detection algorithm that is most promising for it. Empirical results obtained through simulation indicate that a noticeable performance improvement was achieved for probing, denial of service, and user-to-root
