Detecting Malicious Code by Exploiting Dependencies of System-call Groups

In this paper we present an elaborated graph-based algorithmic technique for efficient malware detection. More precisely, we utilize the system-call dependency graphs (or, for short ScD graphs), obtained by capturing taint analysis traces and a set of various similarity metrics in order to detect whether an unknown test sample is a malicious or a benign one. For the sake of generalization, we decide to empower our model against strong mutations by applying our detection technique on a weighted directed graph resulting from ScD graph after grouping disjoint subsets of its vertices. Additionally, we have developed a similarity metric, which we call NP-similarity, that combines qualitative, quantitative, and relational characteristics that are spread among the members of known malware families to archives a clear distinction between graph-representations of malware and the ones of benign software. Finally, we evaluate our detection model and compare our results against the results achieved by a variety of techniques proving the potentials of our model.

[1]  Christopher Krügel,et al.  Scalable, Behavior-Based Malware Clustering , 2009, NDSS.

[2]  Robert Luh,et al.  BEHAVIOR BASED MALWARE RECOGNITION , 2011 .

[3]  Kangbin Yim,et al.  Malware Obfuscation Techniques: A Brief Survey , 2010, 2010 International Conference on Broadband, Wireless Computing, Communication and Applications.

[4]  Daniel T. Larose,et al.  Discovering Knowledge in Data: An Introduction to Data Mining , 2005 .

[5]  Robert Layton,et al.  Malware Detection Based on Structural and Behavioural Features of API Calls , 2010 .

[6]  James Bailey,et al.  Mining Minimal Contrast Subgraph Patterns , 2006, SDM.

[7]  Eric Filiol,et al.  Behavioral detection of malware: from a survey towards an established taxonomy , 2008, Journal in Computer Virology.

[8]  Aditya P. Mathur,et al.  A Survey of Malware Detection Techniques , 2007 .

[9]  Byung Ro Moon,et al.  Malware detection based on dependency graph using hybrid genetic algorithm , 2010, GECCO '10.

[10]  J. T. Curtis,et al.  An Ordination of the Upland Forest Communities of Southern Wisconsin , 1957 .

[11]  Stavros D. Nikolopoulos,et al.  A Survey on Algorithmic Techniques for Malware Detection , 2013 .

[12]  Christopher Krügel,et al.  Effective and Efficient Malware Detection at the End Host , 2009, USENIX Security Symposium.

[13]  Somesh Jha,et al.  Synthesizing Near-Optimal Malware Specifications from Suspicious Behaviors , 2010, 2010 IEEE Symposium on Security and Privacy.

[14]  Yanfang Ye,et al.  IMDS: intelligent malware detection system , 2007, KDD '07.

[15]  Somesh Jha,et al.  Semantics-aware malware detection , 2005, 2005 IEEE Symposium on Security and Privacy (S&P'05).

[16]  Christopher Krügel,et al.  Dynamic Analysis of Malicious Code , 2006, Journal in Computer Virology.

[17]  Andrew Honig,et al.  Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software , 2012 .

[18]  Kirti Mathur,et al.  A Survey on Techniques in Detection and Analyzing Malware , 2013 .

[19]  Dawn Xiaodong Song,et al.  Malware Analysis with Tree Automata Inference , 2011, CAV.

[20]  Engin Kirda,et al.  A View on Current Malware Behaviors , 2009, LEET.

[21]  Suhaimi Ibrahim,et al.  Camouflage in Malware: from Encryption to Metamorphism , 2012 .

[22]  Survey on Malware Detection Methods , 2009 .

[23]  Douglas S. Reeves,et al.  Fast malware classification by automated behavioral graph matching , 2010, CSIIRW '10.