Host intrusion detection for long stealthy system call sequences

In this paper, we present SC2, an unsupervised learning classifier for detecting host intrusions from sequences of process system calls. SC2 is a naïve Bayes-like classifier based on a Markov model. We describe the classifier and then report experimental results on four system call trace data sets from the University of New Mexico: Synthetic Sendmail UNM, Synthetic Sendmail CERT, live lpr UNM, and live lpr MIT. SC2's classification results are compared with those of leading machine learning techniques, namely naïve Bayes multinomial (NBm), C4.5 (decision tree), RIPPER (RP), support vector machine (SVM), and logistic regression (LR). Initial findings show that the accuracy of SC2 is comparable to that of the leading classifiers, and that SC2 achieves a better detection rate than some of them on some data sets. SC2 can efficiently classify very long stealthy sequences by using a backtrack, scale and re-multiply technique, together with an estimate of the standard IEEE 754-2008 relative error of floating-point arithmetic, to reach an acceptable classification confidence.
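The abstract names the backtrack, scale and re-multiply technique without describing it, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes a first-order Markov model with Laplace smoothing, and it assumes that "backtrack" means discarding a product that has fallen below a threshold, scaling the previous value by a power of two, and multiplying again. The names `train`, `scaled_likelihood`, `classify`, `SCALE`, and `TINY` are all hypothetical, as is the choice of 2^512 as the scale factor.

```python
import math
from collections import defaultdict

SCALE_EXP = 512                 # assumed scale exponent (hypothetical choice)
SCALE = 2.0 ** SCALE_EXP        # power of two, so scaling is exact in binary64
TINY = 2.0 ** -SCALE_EXP        # assumed threshold that triggers a rescale
UNIT_ROUNDOFF = 2.0 ** -53      # binary64, round-to-nearest


def train(sequences, alphabet_size, alpha=1.0):
    """Estimate Laplace-smoothed first-order transition probabilities."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1.0

    def prob(a, b):
        row = counts.get(a, {})
        total = sum(row.values())
        return (row.get(b, 0.0) + alpha) / (total + alpha * alphabet_size)

    return prob


def scaled_likelihood(seq, prob):
    """Return (value, n_scales) with likelihood = value * SCALE**(-n_scales).

    Assumes every transition probability is larger than TINY.
    """
    value, n_scales = 1.0, 0
    for a, b in zip(seq, seq[1:]):
        p = prob(a, b)
        candidate = value * p
        if candidate < TINY:
            # Backtrack to the previous value, scale it up, re-multiply.
            value *= SCALE
            n_scales += 1
            candidate = value * p
        value = candidate
    return value, n_scales


def log2_likelihood(value, n_scales):
    """Undo the scaling in the log domain so that classes can be compared."""
    return math.log2(value) - SCALE_EXP * n_scales


def relative_error_bound(n_mults):
    """Standard bound n*u / (1 - n*u) on the error of n rounded multiplies."""
    nu = n_mults * UNIT_ROUNDOFF
    return nu / (1.0 - nu)


def classify(seq, models):
    """models maps a label to a prob function; return the likeliest label."""
    return max(
        models,
        key=lambda label: log2_likelihood(*scaled_likelihood(seq, models[label])),
    )
```

Under these assumptions a plain product over several thousand transitions would underflow to zero, whereas the scaled value stays within the normal binary64 range and the scale count records the lost exponent. Because the scale factor is a power of two, rescaling adds no rounding error, so the bound from `relative_error_bound` covers only the multiplications themselves and not the error already present in the estimated probabilities.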
