A New Learning Approach to Malware Classification Using Discriminative Feature Extraction

With the growth of the Internet, malware has become one of the most significant security threats. Recognizing the specific type of a malware sample is an important step toward removing it effectively. Malware visualization is an important branch of static malware analysis in which a malware binary is converted into an image for visualization and classification. Despite its success, extracting effective texture feature representations remains difficult for challenging datasets. Existing methods rely on global image features, which are sensitive to the relative locations of code within the binary. In this paper, we present a new learning framework for obtaining more discriminative and robust feature descriptors. The proposed method works with existing local descriptors, such as local binary patterns and the dense scale-invariant feature transform, by grouping them into blocks and applying a new bag-of-visual-words model. The resulting features are more flexible than global features and more robust than individual local features. We evaluate the proposed method on three malware databases. The experimental results demonstrate that the obtained descriptors lead to state-of-the-art classification performance.
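The general pipeline described above (binary to image, local descriptors grouped into blocks, bag-of-visual-words encoding) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the new bag-of-visual-words model, so the sketch assumes the basic 8-neighbour local binary pattern, non-overlapping blocks, and a plain k-means codebook. All function names and parameters (`bytes_to_image`, `lbp_codes`, `block_descriptors`, `kmeans`, `bovw_histogram`, `width`, `block`, `k`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def bytes_to_image(data: bytes, width: int = 64) -> np.ndarray:
    """Reshape raw bytes into a 2-D grayscale image, zero-padding the last row."""
    buf = np.frombuffer(data, dtype=np.uint8)
    rows = -(-len(buf) // width)  # ceiling division
    padded = np.zeros(rows * width, dtype=np.uint8)
    padded[:len(buf)] = buf
    return padded.reshape(rows, width)

def lbp_codes(img: np.ndarray) -> np.ndarray:
    """Basic 8-neighbour LBP code for every interior pixel (assumed variant)."""
    h, w = img.shape
    center = img[1:-1, 1:-1].astype(np.int16)
    # Neighbour offsets, clockwise from the top-left corner of the 3x3 window.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[dy:dy + h - 2, dx:dx + w - 2].astype(np.int16)
        codes += (neigh >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes

def block_descriptors(codes: np.ndarray, block: int = 8) -> np.ndarray:
    """One normalized 256-bin LBP histogram per non-overlapping block."""
    h, w = codes.shape
    descs = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            patch = codes[y:y + block, x:x + block]
            hist = np.bincount(patch.ravel(), minlength=256).astype(np.float64)
            descs.append(hist / hist.sum())
    return np.array(descs)

def kmeans(X: np.ndarray, k: int, iters: int = 20, seed: int = 0) -> np.ndarray:
    """Plain Lloyd's k-means; returns the k cluster centers (the codebook)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def bovw_histogram(descs: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Assign each block descriptor to its nearest visual word and count."""
    dist = ((descs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    words = dist.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centers)).astype(np.float64)
    return hist / hist.sum()

# Usage on synthetic bytes standing in for a malware binary.
rng = np.random.default_rng(0)
sample = rng.integers(0, 256, size=64 * 64, dtype=np.uint8).tobytes()
descs = block_descriptors(lbp_codes(bytes_to_image(sample)))
codebook = kmeans(descs, k=8)
feature = bovw_histogram(descs, codebook)
print(feature.shape)
```

Because the final histogram only counts how often each visual word occurs, it does not depend on where a block sits in the image. That is one plausible reading of why a bag-of-visual-words representation is less sensitive to relative code locations than a global image feature. In practice the codebook would be learned from block descriptors pooled over many training samples rather than from a single image as in this toy usage.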
