Learning a new distance metric to improve an SVM-clustering based intrusion detection system

In recent decades, many intrusion detection systems (IDSs) have been proposed to enhance the security of networks. One class of IDSs clusters network traffic into normal and abnormal according to features of the connections. The distance function chosen to measure the similarity and dissimilarity of sessions' features affects the performance of clustering-based IDSs. The most popular distance metric used in designing these IDSs is the Euclidean distance function. In this paper, we argue that more appropriate distance functions can be deployed for IDSs. We propose a method of learning an appropriate distance function from a set of supervision information. This metric is derived by solving a semi-definite optimization problem that attempts to decrease the distances between similar feature vectors and increase the distances between dissimilar ones. Evaluation of this scheme on the Kyoto2006+ dataset shows that the new distance metric can improve the performance of a support vector machine (SVM) clustering-based IDS in terms of normal detection and false positive rates.
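
To make the idea of a learned distance concrete, the following is a minimal sketch, not the paper's method. It assumes the learned metric takes the Mahalanobis form d_M(x, y)^2 = (x - y)^T M (x - y) with M positive semi-definite, and it replaces the semi-definite program with a simple projected-gradient heuristic. The function names (`mahalanobis_sq`, `project_psd`, `learn_metric`), the trade-off weight `mu`, the trace normalization, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def mahalanobis_sq(x, y, M):
    """Squared distance (x - y)^T M (x - y) under a PSD matrix M."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(d @ M @ d)

def project_psd(M):
    """Project a matrix onto the PSD cone by clipping negative eigenvalues."""
    M = (M + M.T) / 2.0
    w, V = np.linalg.eigh(M)
    return (V * np.clip(w, 0.0, None)) @ V.T

def learn_metric(X, similar, dissimilar, steps=200, lr=0.01, mu=1.0):
    """Heuristic (assumed, not from the paper): shrink distances over
    similar pairs and grow them over dissimilar pairs, keeping M PSD
    and its trace fixed so the scale of M cannot grow without bound."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    S = np.zeros((n, n))  # scatter of similar-pair differences
    D = np.zeros((n, n))  # scatter of dissimilar-pair differences
    for i, j in similar:
        S += np.outer(X[i] - X[j], X[i] - X[j])
    for i, j in dissimilar:
        D += np.outer(X[i] - X[j], X[i] - X[j])
    grad = S - mu * D     # gradient of trace(M S) - mu * trace(M D)
    M = np.eye(n)         # start from the Euclidean metric
    for _ in range(steps):
        M = project_psd(M - lr * grad)
        M *= n / max(np.trace(M), 1e-12)
    return M

# Toy data: feature 0 separates the two classes, feature 1 is large-scale noise.
X = np.array([[0.0, 0.0], [0.0, 5.0], [1.0, 0.0], [1.0, 5.0]])
similar = [(0, 1), (2, 3)]
dissimilar = [(0, 2), (1, 3), (0, 3), (1, 2)]
M = learn_metric(X, similar, dissimilar)
print("learned M:\n", M)
print("similar pair   :", mahalanobis_sq(X[0], X[1], M))
print("dissimilar pair:", mahalanobis_sq(X[0], X[2], M))
```

On this toy data the learned matrix places more weight on the discriminative feature and less on the noisy one, so the similar pair ends up closer than under the Euclidean metric while the dissimilar pair ends up farther apart. A real system would solve the constrained optimization with a dedicated solver and use session features from the dataset instead of hand-made points.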
