A Robust Feature Normalization Scheme and an Optimized Clustering Method for Anomaly-Based Intrusion Detection System

Intrusion detection system(IDS) has played a central role as an appliance to effectively defend our crucial computer systems or networks against attackers on the Internet. Traditional IDSs employ signature-based methods or anomaly-based methods which rely on labeled training data. However, they have several problems, for example, it consumes huge amounts of cost and time to acquire the labeled training data, and they often experienced difficulty in detecting new types of attack. In order to cope with the problems, many researchers have proposed various kinds of algorithms for several years. Although they do not require labeled data for training and have the capability to detect unforeseen attacks, they are based on the assumption that the ratio of attack to normal is extremely small. However, the assumption may not be satisfied in a realistic situation because some attacks, most notably the denial-of-service attacks, consist of a large number of simultaneous connections. Consequently if the assumption fails, the performance of the algorithm will deteriorate. In this paper, we present a new normalization and clustering method that can overcome a limitation on the attack ratio of the training data. We evaluated our method using KDD Cup 1999 data set. Evaluation results show that performance of our approach is constant irrespective of an increase in the attack ratio.

[1]  Christopher Leckie,et al.  Unsupervised Anomaly Detection in Network Intrusion Detection Using Clusters , 2005, ACSC.

[2]  Salvatore J. Stolfo,et al.  A Geometric Framework for Unsupervised Anomaly Detection , 2002, Applications of Data Mining in Computer Security.

[3]  Leonid Portnoy,et al.  Intrusion detection with unlabeled data using clustering , 2000 .

[4]  Huan Liu,et al.  Subspace clustering for high dimensional data: a review , 2004, SKDD.

[5]  Brian Everitt,et al.  Cluster analysis , 1974 .

[6]  R.K. Cunningham,et al.  Evaluating intrusion detection systems: the 1998 DARPA off-line intrusion detection evaluation , 2000, Proceedings DARPA Information Survivability Conference and Exposition. DISCEX'00.

[7]  Anil K. Jain,et al.  Algorithms for Clustering Data , 1988 .

[8]  Geoffrey H. Ball,et al.  ISODATA, A NOVEL METHOD OF DATA ANALYSIS AND PATTERN CLASSIFICATION , 1965 .

[9]  Eleazar Eskin,et al.  A GEOMETRIC FRAMEWORK FOR UNSUPERVISED ANOMALY DETECTION: DETECTING INTRUSIONS IN UNLABELED DATA , 2002 .

[10]  Bernhard Schölkopf,et al.  Estimating the Support of a High-Dimensional Distribution , 2001, Neural Computation.

[11]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[12]  Barak A. Pearlmutter,et al.  Detecting intrusions using system calls: alternative data models , 1999, Proceedings of the 1999 IEEE Symposium on Security and Privacy (Cat. No.99CB36344).

[13]  Harold S. Javitz,et al.  The NIDES Statistical Component Description and Justification , 1994 .

[14]  Ali A. Ghorbani,et al.  Y-means: a clustering method for intrusion detection , 2003, CCECE 2003 - Canadian Conference on Electrical and Computer Engineering. Toward a Caring and Humane Technology (Cat. No.03CH37436).

[15]  P. Laskov,et al.  Intrusion Detection in Unlabeled Data with Quarter-sphere Support Vector Machines , 2004, Prax. Inf.verarb. Kommun..

[16]  Dorothy E. Denning,et al.  An Intrusion-Detection Model , 1987, IEEE Transactions on Software Engineering.