Detecting malware variants via function-call graph similarity

Currently, signature-based malware scanning is still the dominant approach to identifying malware samples in the wild because of its low false-positive rate. However, this approach concentrates on a program's specific instructions and lacks insight into high-level semantics, so it is challenged by advanced code obfuscation techniques such as polymorphism and metamorphism. To overcome this shortcoming, this paper extracts a program's function-call graph as its signature and presents a method to compute the similarity between two binaries on the basis of the similarity of their function-call graphs. The proposed method relies on static analysis: it first disassembles the program into assembly code and then uses a novel algorithm to construct the function-call graph from the assembly instructions. It then applies a simple but effective graph matching method to compute the similarity between the two binaries. A prototype is implemented and evaluated on several well-known malware families and benign programs.
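The pipeline described above (assembly listing, then function-call graph, then graph similarity) can be illustrated with a minimal sketch. The abstract does not specify the graph construction or matching algorithms, so this is not the paper's method. It assumes a hypothetical plain-text listing format (`name:` labels and direct `call name` instructions), assumes function names are stable across variants, and substitutes a plain Jaccard index over call edges for the paper's graph matching. The names `build_call_graph` and `edge_jaccard` are illustrative only.

```python
import re

# Hypothetical toy listing format: "name:" opens a function and
# "call name" is a direct call. Real disassembler output is richer.
LABEL = re.compile(r"^([A-Za-z_]\w*):\s*$")
CALL = re.compile(r"^\s*call\s+([A-Za-z_]\w*)\s*$")


def build_call_graph(listing):
    """Return the set of (caller, callee) edges found in a toy listing."""
    edges = set()
    current = None  # function whose body is being scanned
    for line in listing.splitlines():
        line = line.split(";", 1)[0]  # drop trailing comments
        label = LABEL.match(line.strip())
        if label:
            current = label.group(1)
            continue
        call = CALL.match(line)
        if call and current is not None:
            edges.add((current, call.group(1)))
    return edges


def edge_jaccard(edges_a, edges_b):
    """Jaccard index of two edge sets (a stand-in for graph matching)."""
    if not edges_a and not edges_b:
        return 1.0
    return len(edges_a & edges_b) / len(edges_a | edges_b)


VARIANT_A = """
main:
    push ebp
    call decrypt
    call CreateFileA
decrypt:
    xor eax, eax
    call VirtualAlloc
    ret
"""

# Same call structure with junk instructions, a substituted
# instruction, and one extra call.
VARIANT_B = """
main:
    nop
    push ebp
    call decrypt
    nop
    call CreateFileA
    call WriteFile
decrypt:
    mov eax, 0
    call VirtualAlloc
    ret
"""

graph_a = build_call_graph(VARIANT_A)
graph_b = build_call_graph(VARIANT_B)
print(edge_jaccard(graph_a, graph_b))  # 3 shared edges out of 4 -> 0.75
```

The instruction-level changes in the second listing (inserted `nop`s, `mov eax, 0` in place of `xor eax, eax`) leave the call edges untouched, which is the intuition behind using the call graph rather than raw instructions as a signature. A realistic matcher would also need to handle renamed or unnamed local functions and indirect calls, which this sketch ignores.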
