Paper information - An anomaly detection system based on variable N-gram features and one-class SVM

Context: Run-time detection of system anomalies at the host level remains a challenging task. Existing techniques suffer from high false alarm rates, which hinder large-scale deployment of anomaly detection in commercial settings.

Objective: To reduce the false alarm rate, we present a new anomaly detection system based on a novel feature extraction technique, which combines frequency and temporal information from system call traces, and on a one-class support vector machine (OC-SVM) detector.

Method: The proposed feature extraction approach starts by segmenting the system call traces into multiple n-grams of variable length and mapping them to fixed-size sparse feature vectors, which are then used to train OC-SVM detectors.

Results: The results achieved on a real-world system call dataset show that our feature vectors with up to 6-grams outperform the term vector models (using the most common weighting schemes) proposed in related work. More importantly, our anomaly detection system using OC-SVM with a Gaussian kernel, trained on our feature vectors, achieves a higher level of detection accuracy (with a lower false alarm rate) than that achieved by Markovian and n-gram based models as well as by state-of-the-art anomaly detection techniques.

Conclusion: The proposed feature extraction approach from traces of events provides new and general data representations that are suitable for training standard one-class machine learning algorithms, while preserving the temporal dependencies among these events.
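The feature extraction step described under Method can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how n-grams are mapped to a fixed-size vector, so the CRC32 hashing, the vector size `dim`, the frequency normalization, and the function names `variable_ngrams` and `sparse_features` are all assumptions made for illustration. Only the upper bound of 6 on the n-gram length comes from the abstract.

```python
import zlib
from collections import Counter


def variable_ngrams(trace, max_n=6):
    """Yield every contiguous n-gram of length 1..max_n from a system call trace."""
    for n in range(1, max_n + 1):
        for i in range(len(trace) - n + 1):
            yield tuple(trace[i:i + n])


def sparse_features(trace, max_n=6, dim=2**18):
    """Map a trace to a fixed-size sparse vector, stored as {index: weight}.

    Assumption: each n-gram is hashed with CRC32 into one of `dim` slots,
    and weights are relative frequencies. Distinct n-grams may collide.
    """
    counts = Counter()
    for gram in variable_ngrams(trace, max_n):
        key = ",".join(map(str, gram)).encode("utf-8")
        counts[zlib.crc32(key) % dim] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {index: count / total for index, count in counts.items()}


# Hypothetical trace of system call numbers.
example_trace = [3, 5, 3, 5]
example_vector = sparse_features(example_trace, max_n=2)
```

Because each n-gram keeps the order of its calls, the vector carries some temporal information alongside the frequencies. Vectors of this kind could then be passed to a one-class detector, for example scikit-learn's `OneClassSVM` with `kernel="rbf"` for a Gaussian kernel, trained on normal traces only.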
