Data Mining Approaches for Intrusion Detection

In this paper we discuss our research in developing general and systematic methods for intrusion detection. The key ideas are to use data mining techniques to discover consistent and useful patterns of system features that describe program and user behavior, and use the set of relevant system features to compute (inductively learned) classifiers that can recognize anomalies and known intrusions. Using experiments on the sendmail system call data and the network tcpdump data, we demonstrate that we can construct concise and accurate classifiers to detect anomalies. We provide an overview on two general data mining algorithms that we have implemented: the association rules algorithm and the frequent episodes algorithm. These algorithms can be used to compute the intra-and inter-audit record patterns, which are essential in describing program or user behavior. The discovered patterns can guide the audit data gathering process and facilitate feature selection. To meet the challenges of both efficient learning (mining) and real-time detection, we propose an agent-based architecture for intrusion detection systems where the learning agents continuously compute and provide the updated (detection) models to the detection agents.

[1]  William W. Cohen Fast Effective Rule Induction , 1995, ICML.

[2]  Alfonso Valdes,et al.  Live Traffic Analysis of TCP/IP Gateways , 1998, NDSS.

[3]  Richard A. Kemmerer,et al.  State Transition Analysis: A Rule-Based Intrusion Detection Approach , 1995, IEEE Trans. Software Eng..

[4]  Salvatore J. Stolfo,et al.  JAM: Java Agents for Meta-Learning over Distributed Databases , 1997, KDD.

[5]  Harold Joseph Highland,et al.  The 17th NSCS abstructArtificial Intelligence and Intrusion Detection: Current and Future Directions : Jeremy Frank, University of California, Davis, CA , 1995 .

[6]  Jeffrey O. Kephart,et al.  Blueprint for a Computer Immune System , 1999 .

[7]  Arthur B. Maccabe,et al.  The architecture of a network level intrusion detection system , 1990 .

[8]  Gregory Piatetsky-Shapiro,et al.  The KDD process for extracting useful knowledge from volumes of data , 1996, CACM.

[9]  Sandeep Kumar,et al.  A Software Architecture to Support Misuse Intrusion Detection , 1995 .

[10]  Vern Paxson,et al.  End-to-end Internet packet dynamics , 1997, SIGCOMM '97.

[11]  Chris Hare,et al.  Internet Security: Professional Reference , 1996 .

[12]  T. Lane,et al.  Sequence Matching and Learning in Anomaly Detection for Computer Security , 1997 .

[13]  Salvatore J. Stolfo,et al.  Toward parallel and distributed learning by meta-learning , 1993 .

[14]  Karl N. Levitt,et al.  Automated detection of vulnerabilities in privileged programs by execution monitoring , 1994, Tenth Annual Computer Security Applications Conference.

[15]  Stephanie Forrest,et al.  A sense of self for Unix processes , 1996, Proceedings 1996 IEEE Symposium on Security and Privacy.

[16]  Ramakrishnan Srikant,et al.  Fast algorithms for mining association rules and sequential patterns , 1996 .

[17]  Derek Atkins,et al.  Internet security professional reference , 1996 .

[18]  Heikki Mannila,et al.  Discovering Frequent Episodes in Sequences , 1995, KDD.

[19]  Philip K. Chan,et al.  Learning Patterns from Unix Process Execution Traces for Intrusion Detection , 1997 .

[20]  Ramakrishnan Srikant,et al.  Mining generalized association rules , 1995, Future Gener. Comput. Syst..

[21]  Vern Paxson,et al.  Bro: a system for detecting network intruders in real-time , 1998, Comput. Networks.

[22]  S. M. Bellovin,et al.  Security problems in the TCP/IP protocol suite , 1989, CCRV.

[23]  W. Richard Stevens Tcp/ip illustrated- volume 1 , 1994 .