A graph-based model for malware detection and classification using system-call groups

In this paper we present a graph-based model that uses relations between groups of system calls to detect whether an unknown software sample is malicious or benign, and to classify a malicious sample into one of a set of known malware families. More precisely, we use System-call Dependency Graphs (ScD-graphs for short), obtained from traces captured through dynamic taint analysis. To make the model resistant to strong mutations, we apply our detection and classification techniques to a weighted directed graph, the Group Relation Graph (Gr-graph for short), which results from the ScD-graph after grouping disjoint subsets of its vertices. For detection we propose the Δ-similarity metric, and for classification we propose the SaMe-similarity and NP-similarity metrics, which together constitute the SaMe-NP similarity. Finally, we evaluate the model on malware detection and classification, measuring its detection rates and classification accuracy.
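
The grouping step that turns an ScD-graph into a Gr-graph can be illustrated with a minimal sketch. It assumes, for illustration only, that the ScD-graph is given as a list of directed edges between system calls, that each system call belongs to exactly one group, and that the weight of a Gr-graph edge is the number of ScD-graph edges between the two groups. The abstract does not fix the weighting, and the function name, system-call names, and group names below are hypothetical.

```python
from collections import defaultdict


def build_group_relation_graph(scd_edges, group_of):
    """Collapse a directed ScD-graph into a weighted directed Gr-graph.

    scd_edges: iterable of (source_call, target_call) pairs.
    group_of:  dict mapping each system call to its single group
               (the groups are disjoint subsets of the vertices).
    Returns a dict mapping (source_group, target_group) to a weight.
    Assumption: the weight is the number of ScD-graph edges between
    the two groups; other weightings are possible.
    """
    weights = defaultdict(int)
    for source_call, target_call in scd_edges:
        edge = (group_of[source_call], group_of[target_call])
        weights[edge] += 1
    return dict(weights)


# Hypothetical ScD-graph edges and grouping, for illustration only.
scd_edges = [
    ("NtOpenFile", "NtReadFile"),
    ("NtReadFile", "NtWriteFile"),
    ("NtReadFile", "NtSetValueKey"),
    ("NtCreateKey", "NtSetValueKey"),
]
group_of = {
    "NtOpenFile": "File",
    "NtReadFile": "File",
    "NtWriteFile": "File",
    "NtCreateKey": "Registry",
    "NtSetValueKey": "Registry",
}

gr_graph = build_group_relation_graph(scd_edges, group_of)
```

Because edges are aggregated per group, replacing one system call with another from the same group leaves the Gr-graph unchanged in this sketch, which is the intuition behind working at the group level to tolerate mutations.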
