The Impact of Distance Metrics on K-means Clustering Algorithm Using in Network Intrusion Detection Data

A Network Intrusion Detection System (NIDS) detects suspicious activities that aim to harm the network. Because NIDSs help keep networks safer, many researchers are motivated to propose more accurate ones. K-means is a distance-based clustering algorithm that is widely used in IDS research. This paper evaluates the impact of the Euclidean and Manhattan distance metrics on the K-means algorithm when it is used to cluster the KDD Cup 99 intrusion detection data. Experimental results indicate that the Manhattan distance metric outperforms the Euclidean distance metric on the performance evaluation metrics considered.
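To make the comparison concrete, the following is a minimal sketch of K-means in which only the distance metric used for cluster assignment changes. It is not the paper's implementation. The function names `pairwise_distance` and `kmeans` are hypothetical, the centroid update is assumed to be the per-cluster mean under both metrics (the abstract does not say how centroids are updated), and small synthetic data stands in for KDD Cup 99.

```python
import numpy as np


def pairwise_distance(points, centroids, metric):
    """Return an (n_points, n_centroids) matrix of distances."""
    # Broadcast to shape (n_points, n_centroids, n_features).
    diff = points[:, None, :] - centroids[None, :, :]
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=2))
    if metric == "manhattan":
        return np.abs(diff).sum(axis=2)
    raise ValueError(f"unknown metric: {metric}")


def kmeans(points, k, metric="euclidean", n_iter=50, seed=0):
    """Lloyd-style K-means with a selectable assignment metric.

    Assumption: centroids are updated with the mean for both metrics.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = pairwise_distance(points, centroids, metric).argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment against the returned centroids.
    labels = pairwise_distance(points, centroids, metric).argmin(axis=1)
    return labels, centroids


# Synthetic stand-in data: two Gaussian blobs in four dimensions.
rng = np.random.default_rng(42)
data = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 4)),
    rng.normal(5.0, 1.0, size=(100, 4)),
])

for metric in ("euclidean", "manhattan"):
    labels, centroids = kmeans(data, k=2, metric=metric)
    print(metric, "cluster sizes:", np.bincount(labels, minlength=2))
```

On real intrusion data the features would typically need to be encoded and scaled first, since both metrics are sensitive to feature ranges. Note also that the mean minimizes squared Euclidean distance, not Manhattan distance, so a per-feature median update (k-medians) is a common alternative when the Manhattan metric is used.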
