Multi-threaded On-the-Fly Model Generation of Malware with Hash Compaction

This paper introduces a multi-threaded implementation of BE-PUM, our binary code analyzer for malware. On-the-fly model generation by BE-PUM is combined with duplicate detection and a hash compaction method to minimize resource consumption. The method operates in three phases: parallel expansion of states, duplicate detection, and update of the state space. A notable feature of our algorithm is that, owing to our strategy of local resource management, it requires very little synchronization or cooperation between threads, which is often a bottleneck in multi-threading. Experiments on 125 real-world malware samples show a good performance improvement.
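The three phases can be illustrated with a minimal sketch. It is not BE-PUM's implementation: the transition relation `successors` is a toy stand-in for symbolically executing one instruction, and the names `compact`, `expand_chunk`, and `explore` are hypothetical. Each worker fills its own local buffer, so no locks are needed during expansion; the only shared structure, the set of visited fingerprints, is touched between rounds by a single thread. In CPython the threads mainly show the structure rather than a speedup.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def compact(state, bits=64):
    """Hash compaction: store a short fingerprint instead of the full state."""
    digest = hashlib.sha256(repr(state).encode()).digest()
    return int.from_bytes(digest[: bits // 8], "big")

def successors(state):
    """Toy transition relation (assumption, stands in for one analysis step)."""
    return [(state * 2) % 97, (state + 3) % 97]

def expand_chunk(chunk):
    """Phase 1 worker: expand states into a thread-local buffer."""
    local = {}
    for s in chunk:
        for t in successors(s):
            local.setdefault(compact(t), t)  # local duplicate removal
    return local

def explore(initial, workers=4):
    visited = {compact(initial)}  # fingerprints only, not full states
    frontier = [initial]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            # Phase 1: parallel expansion, one slice of the frontier per worker.
            chunks = [frontier[i::workers] for i in range(workers)]
            results = list(pool.map(expand_chunk, chunks))
            # Phase 2: duplicate detection against the visited fingerprints.
            fresh = {}
            for local in results:
                for fp, t in local.items():
                    if fp not in visited:
                        fresh.setdefault(fp, t)
            # Phase 3: update the state space and build the next frontier.
            visited.update(fresh)
            frontier = list(fresh.values())
    return visited

if __name__ == "__main__":
    print(len(explore(1)))
```

Because only fingerprints are kept, two distinct states that hash to the same value would be treated as duplicates and one of them silently dropped; this is the usual trade-off of hash compaction, and wider fingerprints make such collisions less likely at the cost of memory.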
