An anomaly intrusion detection method by clustering normal user behavior

To detect intrusions from anomalies in a user's activities, previous work has concentrated on statistical techniques or frequent episode mining to analyze an audit data set. However, because these approaches mainly capture the average behavior of a user's activities, they can detect some anomalies inaccurately. This paper proposes an anomaly detection method that uses a clustering algorithm to model the normal behavior of a user's activities on a host. Because clustering can identify an arbitrary number of dense ranges in an analysis domain, it avoids the inaccuracy introduced by statistical analysis and therefore models a user's frequent activities more accurately. The common knowledge in a user's transactions is represented by two measures: the occurrence frequency of similar activities, counted per transaction, and the repetition ratio of similar activities within each transaction. The proposed method also addresses how to maintain the identified common knowledge as a concise profile. Furthermore, this paper addresses the selection of good features that can improve the detection rate of anomalous behavior in an on-line transaction.
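The contrast between an average-based profile and a profile of dense ranges can be illustrated with a minimal sketch. It is not the paper's algorithm: a simple gap-based grouping of one-dimensional values stands in for the clustering step, and the function names, the `max_gap` and `min_support` parameters, and the sample history are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def dense_ranges(values, max_gap, min_support):
    """Group 1-D values into ranges of nearby points (hypothetical stand-in
    for the clustering step). Consecutive sorted values farther apart than
    max_gap start a new range; ranges with fewer than min_support values
    are discarded as sparse."""
    ranges, run = [], []
    for v in sorted(values):
        if run and v - run[-1] > max_gap:
            if len(run) >= min_support:
                ranges.append((run[0], run[-1]))
            run = []
        run.append(v)
    if len(run) >= min_support:
        ranges.append((run[0], run[-1]))
    return ranges


def is_normal_by_ranges(x, ranges):
    """A value is normal if it falls inside any dense range of the profile."""
    return any(lo <= x <= hi for lo, hi in ranges)


def is_normal_by_mean(x, values, k=2.0):
    """Average-based check: normal if within k standard deviations of the mean."""
    mu, sigma = mean(values), pstdev(values)
    return abs(x - mu) <= k * sigma


# Assumed history of one feature (e.g. how often an activity occurs per
# transaction). The user has two distinct habits: around 2-3 and around 20-22.
history = [2, 3, 2, 3, 2, 20, 21, 22, 21, 20]
profile = dense_ranges(history, max_gap=2, min_support=3)

print(profile)                           # two dense ranges, one per habit
print(is_normal_by_mean(12, history))    # the average-based check accepts 12
print(is_normal_by_ranges(12, profile))  # the range-based profile rejects 12
```

In this toy history the value 12 lies close to the mean, so the average-based check accepts it even though the user has never behaved that way; the profile of dense ranges rejects it because it falls in neither range.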
