A Fast Flowgraph Based Classification System for Packed and Polymorphic Malware on the Endhost

Identifying malicious software provides great benefit for distributed and networked systems. Traditional real-time malware detection has relied on signatures and string matching. However, string signatures deal poorly with polymorphic malware variants. Control flow has been proposed as an alternative signature that can be identified across such variants. This paper proposes a novel classification system that detects polymorphic variants using flowgraphs. We use an existing heuristic flowgraph matching algorithm to estimate graph isomorphisms, and we determine the similarity between programs by identifying their underlying isomorphic flowgraphs. A high similarity between the query program and known malware identifies the query as a variant. To demonstrate the effectiveness and efficiency of our flowgraph-based classification, we compare it to alternative algorithms and evaluate the system using real and synthetic malware. The evaluation shows that our system accurately detects real malware, performs efficiently, and is scalable. These performance characteristics enable real-time use on an intermediary node such as an email gateway, or on the end host.
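
The idea of scoring program similarity through matched flowgraphs can be illustrated with a minimal sketch. It is not the matching algorithm used in the paper, which the abstract does not specify. As an assumption, each program is represented as a list of flowgraphs given as edge lists, and isomorphism is estimated with a cheap invariant (node count plus sorted in/out-degree pairs). Isomorphic graphs always share this fingerprint, but some non-isomorphic graphs do too, so the result is only an estimate. The names `graph_signature`, `program_similarity`, and `is_variant`, and the 0.6 threshold, are hypothetical.

```python
from collections import Counter


def graph_signature(edges):
    """Return an isomorphism-invariant fingerprint of a directed graph.

    edges: iterable of (src, dst) pairs with hashable node labels.
    Equal fingerprints suggest, but do not prove, isomorphism.
    """
    in_deg, out_deg, nodes = Counter(), Counter(), set()
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
        nodes.update((src, dst))
    degree_pairs = sorted((in_deg[n], out_deg[n]) for n in nodes)
    return (len(nodes), tuple(degree_pairs))


def program_similarity(query_graphs, known_graphs):
    """Fraction of flowgraphs whose fingerprints match, in [0.0, 1.0]."""
    q = Counter(graph_signature(g) for g in query_graphs)
    k = Counter(graph_signature(g) for g in known_graphs)
    matched = sum((q & k).values())  # multiset intersection
    total = max(sum(q.values()), sum(k.values()))
    return matched / total if total else 0.0


def is_variant(query_graphs, known_graphs, threshold=0.6):
    """Flag the query as a variant when similarity reaches the threshold.

    The threshold value is an illustrative assumption.
    """
    return program_similarity(query_graphs, known_graphs) >= threshold


# Hypothetical data: a loop-shaped and a branch-shaped flowgraph.
known = [[("a", "b"), ("b", "c"), ("c", "a")], [("a", "b"), ("a", "c")]]
# Same shapes with renamed nodes, as a polymorphic variant might produce.
variant = [[("x", "y"), ("y", "z"), ("z", "x")], [("p", "q"), ("p", "r")]]
unrelated = [[("a", "b")]]

print(program_similarity(variant, known))
print(is_variant(unrelated, known))
```

Because the fingerprint ignores node labels, renaming or reordering basic blocks leaves the score unchanged, which is the property that makes control flow attractive as a signature across polymorphic variants.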
