K-Nearest-Neighbours with a novel similarity measure for intrusion detection

K-Nearest-Neighbours is one of the simplest yet most effective classification methods. Its core computation is to calculate the distance from a query point to every training point and to select the k closest ones, whose labels determine the predicted class. The Euclidean distance is the most frequent choice, although other distances are sometimes required. This paper explores a simple yet effective similarity definition within Nearest Neighbours for intrusion detection applications. The proposed similarity rule is fast to compute and achieves satisfactory performance on the intrusion detection benchmark data sets tested.
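The nearest-neighbour computation described above can be illustrated with a minimal sketch. The abstract does not define the proposed similarity measure, so the example uses the Euclidean distance as a stand-in and exposes the distance as a parameter where another measure could be substituted. The function names (`euclidean`, `knn_predict`), the majority-vote rule, and the toy data are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Classify `query` by majority vote among its k nearest training points.

    `distance` is any callable taking two vectors and returning a number
    (smaller means more similar). It is a placeholder for whichever
    measure is under study; the paper's own measure is not reproduced here.
    """
    # Distance from the query to every training point, paired with its label.
    scored = [(distance(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep the k closest points (sorted on distance only).
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]
    # Majority vote over the labels of those k points.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features per connection record.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["normal", "normal", "normal", "attack", "attack", "attack"]

prediction = knn_predict(train_X, train_y, (0.2, 0.1), k=3)
```

Passing a different callable as `distance` changes the notion of closeness without touching the rest of the classifier, which is the point at which an alternative similarity definition would plug in.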
