K-Means Clustering Approach to Analyze NSL-KDD Intrusion Detection Dataset

Abstract: Clustering is a widely accepted technique for analyzing raw data. It can help detect intrusions when the training data is unlabeled, and it can also detect new and previously unknown types of intrusions. In this paper we analyze the NSL-KDD dataset using the Simple K-Means clustering algorithm. We cluster the dataset into normal traffic and the four major attack categories: DoS, Probe, R2L, and U2R. Experiments are performed in the WEKA environment, and the results are verified and validated using the test dataset. Our main objective is to provide a complete analysis of the NSL-KDD intrusion detection dataset.
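As a rough illustration of the approach, the sketch below runs a plain k-means (Lloyd's algorithm) on a handful of records and then names each cluster after the most common label among its members, which is one simple way to check clusters against labeled data. It is not the WEKA SimpleKMeans implementation used in the experiments. The two-feature toy records, the function names, and the majority-vote validation step are all assumptions made for illustration; the real NSL-KDD records have many more attributes and would need numeric encoding and scaling first.

```python
import random
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct records
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the run has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_point(members)
    return centroids, assignments


def majority_labels(assignments, labels):
    """Map each cluster index to the most common true label inside it."""
    votes = {}
    for cluster, label in zip(assignments, labels):
        votes.setdefault(cluster, Counter())[label] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in votes.items()}


# Hypothetical toy records (two made-up numeric features each), one small
# group per category named in the abstract. Not real NSL-KDD data.
toy_records = [
    ((0.0, 0.0), "normal"), ((0.2, 0.1), "normal"), ((0.1, 0.3), "normal"),
    ((10.0, 0.0), "DoS"), ((10.2, 0.3), "DoS"), ((9.8, 0.1), "DoS"),
    ((0.0, 10.0), "Probe"), ((0.3, 10.1), "Probe"), ((0.1, 9.8), "Probe"),
    ((10.0, 10.0), "R2L"), ((10.1, 9.7), "R2L"), ((9.9, 10.2), "R2L"),
    ((5.0, 20.0), "U2R"), ((5.2, 20.1), "U2R"), ((4.9, 19.8), "U2R"),
]
toy_points = [point for point, _ in toy_records]
toy_labels = [label for _, label in toy_records]

toy_centroids, toy_assignments = k_means(toy_points, k=5)
toy_cluster_names = majority_labels(toy_assignments, toy_labels)
```

Because k-means depends on its starting centroids, a single run can settle on a poor split (for example, two centroids sharing one group), so in practice several seeds are usually tried and the run with the lowest within-cluster error is kept.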
