Detection of global, metamorphic malware variants using control and data flow analysis

Current malware detection and classification tools fail to adequately address variants generated automatically by new polymorphic and metamorphic transformation engines, which can produce variants that bear no resemblance to one another. Current approaches address this problem by employing syntactic signatures that mimic the underlying control structures, such as call graphs and flow graphs. These techniques, however, are easily defeated by new program diversification techniques. This hampers our ability to defend against zero-day attacks perpetrated by such auto-replicating, rapidly spreading malware variants. In this paper, we present a new form of abstract malware signature generation based on extracting semantic summaries of malware code; these summaries are immune to most polymorphic and metamorphic transformations. We also present results of our initial experimental evaluation of the proposed approach.
