Malware Analysis Using Visualized Image Matrices

This paper proposes a novel malware visual analysis method that comprises both a visualization method for converting binary files into images and a method for calculating the similarity between those images. The proposed method generates RGB-colored pixels on image matrices from the opcode sequences extracted from malware samples, and then calculates the similarities between the image matrices. In particular, the method also applies to packed malware samples, because it can be run on execution traces extracted through dynamic analysis. To reduce the overhead of image generation, opcode sequences are extracted only from the blocks that contain instructions related to major behaviors, such as function calls and application programming interface (API) calls. In addition, we propose a technique that generates a representative image for each malware family in order to reduce the number of comparisons needed to classify unknown samples. The colored pixel information in the image matrices is used to calculate the similarities between images. Our experimental results show that the image matrices of malware can be used effectively to classify malware families both statically and dynamically, with accuracies of 0.9896 and 0.9732, respectively.
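The pipeline described above (opcode sequences to colored pixels, then pixel-based similarity) can be illustrated with a minimal sketch. The abstract does not specify how coordinates and colors are derived or how the similarity is defined, so everything below is an assumption made for illustration: the MD5-based mapping, the 16x16 matrix size, the overwrite-on-collision rule, the distance-based similarity score, and all function names are hypothetical and are not the authors' method.

```python
import hashlib
import math

SIZE = 16  # hypothetical image-matrix width and height
MAX_DIST = math.sqrt(3 * 255 ** 2)  # largest possible distance between two RGB colors


def block_to_pixel(opcodes, size=SIZE):
    """Map one block's opcode sequence to ((x, y), (r, g, b)).

    Assumption: bytes of an MD5 digest stand in for whatever hash
    functions the paper actually uses.
    """
    digest = hashlib.md5(" ".join(opcodes).encode("ascii")).digest()
    position = (digest[0] % size, digest[1] % size)
    color = (digest[2], digest[3], digest[4])
    return position, color


def build_image(blocks, size=SIZE):
    """Build a sparse image matrix as a dict {(x, y): (r, g, b)}."""
    image = {}
    for opcodes in blocks:
        position, color = block_to_pixel(opcodes, size)
        image[position] = color  # assumption: a later block overwrites on collision
    return image


def similarity(img_a, img_b):
    """Score two sparse images in [0, 1].

    Assumption: a pixel colored in both images contributes
    1 - (normalized RGB distance); a pixel colored in only one
    image contributes 0. The score is averaged over the union
    of colored positions.
    """
    positions = set(img_a) | set(img_b)
    if not positions:
        return 1.0
    total = 0.0
    for pos in positions:
        if pos in img_a and pos in img_b:
            total += 1.0 - math.dist(img_a[pos], img_b[pos]) / MAX_DIST
    return total / len(positions)


# Hypothetical opcode blocks from two samples.
sample_a = [["push", "mov", "call"], ["xor", "cmp", "jne"], ["mov", "add", "ret"]]
sample_b = [["push", "mov", "call"], ["xor", "cmp", "jne"], ["sub", "lea", "jmp"]]

image_a = build_image(sample_a)
image_b = build_image(sample_b)
score = similarity(image_a, image_b)
```

Because identical opcode blocks always hash to the same position and color, samples that share many blocks produce images with many matching pixels, which is the intuition behind comparing image matrices rather than raw binaries. The same functions would apply unchanged to opcode blocks taken from an execution trace instead of a static disassembly.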
