Towards sequencing malicious system calls

System-call analysis is recognized as one of the most promising approaches to malware detection because it can detect malware variants as well as zero-day malware. However, a key challenge of system-call-based analysis, and one that prevents its use in real-time detection systems, is the excessive size and dimensionality of the system-call sequences produced by most present-day malware. The contributions of our work are two-fold: (1) We propose a novel approach to malware system-call sequence representation that enables more effective detection and analysis of individual malware instances as well as their corresponding malware families. In particular, our approach considerably reduces the size of the system-call sequences of the software/malware instances under analysis, without falling victim to so-called “dummy insertion attacks”. (2) Building upon (1), we also propose a novel supervised-learning-based framework for detecting malicious system-call sequences in previously unseen software programs. This framework can also be used for effective identification and auditing of benign software programs that are not necessarily malicious.
