Performance evaluation of data clustering techniques using the KDD Cup-99 intrusion detection data set

Intrusion detection systems aim to detect attacks against computer systems and networks or, more generally, against information systems. A number of techniques are available for intrusion detection, and data mining is one of the efficient techniques among them. Intrusion detection and clustering have long been active topics in machine learning. Data clustering is the process of putting related data into groups: a clustering technique partitions a data set into several groups such that the similarity within a group is larger than the similarity between groups, that is, the groups exhibit intra-group similarity and inter-group dissimilarity. Clustering has long proved beneficial as an intrusion detection technique. This paper evaluates four of the most representative off-line clustering techniques: k-means clustering, fuzzy c-means clustering, mountain clustering, and subtractive clustering. These techniques are implemented and tested against the KDD Cup-99 data set, which is used as a standard benchmark data set for intrusion detection. The performance and accuracy of the four techniques are presented and compared. The experimental results obtained by applying these algorithms to the KDD Cup-99 data set demonstrate that the k-means and fuzzy c-means clustering algorithms perform well in terms of accuracy and computation time.
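To make the partitioning idea concrete, the following is a minimal sketch of k-means (Lloyd's iteration) in plain Python. It is an illustration only, not the implementation evaluated in this paper: the function name `kmeans`, the toy two-dimensional points, and the choice to initialise the centres from the first k points are all assumptions made for a short, reproducible example. KDD Cup-99 records would first need their features encoded numerically.

```python
import math


def kmeans(points, k, iters=100):
    """Lloyd's k-means on a list of equal-length numeric tuples.

    Assumption for reproducibility: the first k points seed the centres.
    Practical implementations usually use random or k-means++ seeding.
    """
    centers = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centre.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(centers[j])  # empty cluster keeps its centre
        if new_centers == centers:
            break  # centres stopped moving, so assignments are stable
        centers = new_centers
    return centers, labels


# Hypothetical toy data: two well-separated 2-D groups.
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 4.9)]
centers, labels = kmeans(data, k=2)
```

On this toy input the first three points end up in one cluster and the last three in the other, which is the intra-group similarity and inter-group dissimilarity property described above. Fuzzy c-means differs in that each point receives a degree of membership in every cluster rather than a single label.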