System Call-Based Detection of Malicious Processes

System call analysis is a behavioral malware detection technique that is popular due to its promising detection results and ease of implementation. This study describes a system that uses system call analysis to detect malware that evade traditional defenses. The system monitors executing processes to identify compromised hosts in production environments. Experimental results compare the effectiveness of multiple feature extraction strategies and detectors based on their detection accuracy at low false positive rates. Logistic regression and support vector machines consistently outperform log-likelihood ratio and signature detectors as processing and detection methods. A feature selection study indicates that a relatively small set of system call 3-grams provide detection accuracy comparable to that of more complex models. A case study indicates that the detection system performs well against a variety of malware samples, benign workloads, and host configurations.

[1]  Qiang Chen,et al.  Probabilistic techniques for intrusion detection based on computer audit data , 2001, IEEE Trans. Syst. Man Cybern. Part A.

[2]  H. V. Trees Detection, Estimation, And Modulation Theory , 2001 .

[3]  Christopher Krügel,et al.  Limits of Static Analysis for Malware Detection , 2007, Twenty-Third Annual Computer Security Applications Conference (ACSAC 2007).

[4]  Mattia Monga,et al.  Detecting Self-mutating Malware Using Control-Flow Graph Matching , 2006, DIMVA.

[5]  Qinghua Zhang,et al.  MetaAware: Identifying Metamorphic Malware , 2007, Twenty-Third Annual Computer Security Applications Conference (ACSAC 2007).

[6]  Jörg Kindermann,et al.  Text Categorization with Support Vector Machines. How to Represent Texts in Input Space? , 2002, Machine Learning.

[7]  Ian H. Witten,et al.  Data mining: practical machine learning tools and techniques, 3rd Edition , 1999 .

[8]  Christopher Krügel,et al.  A survey on automated dynamic malware-analysis techniques and tools , 2012, CSUR.

[9]  Dae-Ki Kang,et al.  Learning classifiers for misuse and anomaly detection using a bag of system calls representation , 2005, Proceedings from the Sixth Annual IEEE SMC Information Assurance Workshop.

[10]  Barak A. Pearlmutter,et al.  Detecting intrusions using system calls: alternative data models , 1999, Proceedings of the 1999 IEEE Symposium on Security and Privacy (Cat. No.99CB36344).

[11]  Claudia Eckert,et al.  Leveraging String Kernels for Malware Detection , 2013, NSS.

[12]  Thomas Stibor,et al.  A supervised topic transition model for detecting malicious system call sequences , 2011, KDMS '11.

[13]  Christopher Krügel,et al.  Exploring Multiple Execution Paths for Malware Analysis , 2007, 2007 IEEE Symposium on Security and Privacy (SP '07).

[14]  Christopher Krügel,et al.  A quantitative study of accuracy in system call-based malware detection , 2012, ISSTA 2012.

[15]  Christopher Krügel,et al.  AccessMiner: using system-centric models for malware protection , 2010, CCS '10.

[16]  Stefan Axelsson,et al.  The base-rate fallacy and the difficulty of intrusion detection , 2000, TSEC.

[17]  Muddassar Farooq,et al.  Towards a Theory of Generalizing System Call Representation for In-Execution Malware Detection , 2010, 2010 IEEE International Conference on Communications.

[18]  Herbert Bos,et al.  Pointless tainting?: evaluating the practicality of pointer tainting , 2009, EuroSys '09.

[19]  Geoff Holmes,et al.  Multinomial Naive Bayes for Text Categorization Revisited , 2004, Australian Conference on Artificial Intelligence.

[20]  David Madigan,et al.  Large-Scale Bayesian Logistic Regression for Text Categorization , 2007, Technometrics.

[21]  Stefan Katzenbeisser,et al.  Proactive Detection of Computer Worms Using Model Checking , 2010, IEEE Transactions on Dependable and Secure Computing.

[22]  Lior Rokach,et al.  Detection of unknown computer worms based on behavioral classification of the host , 2008, Comput. Stat. Data Anal..

[23]  Aixia Guo,et al.  Gene Selection for Cancer Classification using Support Vector Machines , 2014 .

[24]  V. Rao Vemuri,et al.  Using Text Categorization Techniques for Intrusion Detection , 2002, USENIX Security Symposium.

[25]  Somesh Jha,et al.  Semantics-aware malware detection , 2005, 2005 IEEE Symposium on Security and Privacy (S&P'05).

[26]  Stephanie Forrest,et al.  A sense of self for Unix processes , 1996, Proceedings 1996 IEEE Symposium on Security and Privacy.

[27]  Tong Zhang,et al.  Solving large scale linear prediction problems using stochastic gradient descent algorithms , 2004, ICML.