Deep Learning Framework and Visualization for Malware Classification

In this paper we propose a deep learning framework for malware classification. The volume of malware generated has grown enormously in recent years and poses a genuine security threat to organizations and individuals. To combat this proliferation, new strategies are needed to identify and classify malware quickly. We experiment on Malimg, a publicly available benchmark data set. The architecture is a hybrid cost-sensitive network that combines a one-dimensional Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network. It obtains an accuracy of 94.4%, compared with the 84.9% reported in [1]. The parameters of the deep learning architecture are set through hyperparameter tuning. A learning rate of 0.01 is used in all experiments, with a 70-30% train-test split. This setup makes it possible to assess how well the models perform on imbalanced data sets. The proposed method does not require conventional steps such as disassembly, decompilation, de-obfuscation, or execution of the binary. The source code and the trained models are made publicly available for further research.
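
As an illustrative sketch (not the authors' released code), the following Python shows how the experimental setup described above might be prepared without disassembling or executing the binaries: raw bytes are mapped to a fixed-length one-dimensional sequence, samples are split 70-30 within each family, and per-class weights are derived for a cost-sensitive loss. The function names, the sequence length of 1024, and the inverse-frequency weighting are assumptions; the abstract does not state how the inputs are built or how the costs are chosen.

```python
import random
from collections import Counter, defaultdict


def bytes_to_sequence(raw, length=1024):
    """Map raw bytes to a fixed-length list of values in [0, 1].

    The length of 1024 is an assumed value, not one taken from the paper.
    """
    values = [b / 255.0 for b in raw[:length]]
    # Zero-pad short inputs so every sample has the same length.
    return values + [0.0] * (length - len(values))


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), splitting each class separately
    so that rare families appear in both partitions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def inverse_frequency_weights(labels):
    """Assumed cost scheme: weight each class by n_samples / (n_classes * count),
    so errors on rare families cost more than errors on common ones."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


if __name__ == "__main__":
    # Hypothetical, heavily imbalanced two-family label set.
    labels = ["A"] * 90 + ["B"] * 10
    train_idx, test_idx = stratified_split(labels)
    weights = inverse_frequency_weights([labels[i] for i in train_idx])
    print(len(train_idx), len(test_idx), weights)
    print(bytes_to_sequence(b"\x00\xff", length=4))
```

With 90 samples of one family and 10 of another, the split yields 70 training and 30 test samples, and the minority family receives a weight of 5.0 against roughly 0.56 for the majority. Weights of this kind could then be supplied to the loss function of a CNN-LSTM classifier.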

[1] Guanghui Liang, et al. Image classification for malware detection using extremely randomized trees, 2017, 2017 11th IEEE International Conference on Anti-counterfeiting, Security, and Identification (ASID).

[2] Dan Chia-Tien Lo, et al. Binary malware image classification using machine learning with local binary pattern, 2017, 2017 IEEE International Conference on Big Data (Big Data).

[3] B. S. Manjunath, et al. Malware images: visualization and automatic classification, 2011, VizSec '11.

[4] Felan Carlo C. Garcia, et al. Random Forest for Malware Classification, 2016, ArXiv.

[5] Hui Li, et al. A malware classification method based on memory dump grayscale image, 2018, Digit. Investig.

[6] Abien Fred Agarap, et al. Towards Building an Intelligent Anti-Malware System: A Deep Learning Approach using Support Vector Machine (SVM) for Malware Classification, 2017, ArXiv.

[7] Eul Gyu Im, et al. Malware analysis using visualized images and entropy graphs, 2014, International Journal of Information Security.

[8] Vinod Yegneswaran, et al. A comparative assessment of malware classification using binary texture analysis and dynamic analysis, 2011, AISec '11.

[9] K. P. Soman, et al. A Detailed Investigation and Analysis of Deep Learning Architectures and Visualization Techniques for Malware Family Identification, 2019, Advanced Sciences and Technologies for Security Applications.

[10] Songqing Yue, et al. Imbalanced Malware Images Classification: a CNN based Approach, 2017, ArXiv.

[11] Hai Anh Tran, et al. A LSTM based framework for handling multiclass imbalance in DGA botnet detection, 2018, Neurocomputing.

[12] K. P. Soman, et al. Evaluating effectiveness of shallow and deep networks to intrusion detection system, 2017, 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI).

[13] Prabaharan Poornachandran, et al. Scalable Framework for Cyber Threat Situational Awareness Based on Domain Name Systems Data Analysis, 2018.

[14] K. P. Soman, et al. Evaluation of Recurrent Neural Network and its Variants for Intrusion Detection System (IDS), 2017, Int. J. Inf. Syst. Model. Des.

[15] Aziz Makandar, et al. Malware class recognition using image processing techniques, 2017, 2017 International Conference on Data Management, Analytics and Innovation (ICDMAI).

[16] Chang Hoon Kim, et al. Classifying malware using convolutional gated neural network, 2018, 2018 20th International Conference on Advanced Communication Technology (ICACT).

[17] K. P. Soman, et al. S.P.O.O.F Net: Syntactic Patterns for identification of Ominous Online Factors, 2018, 2018 IEEE Security and Privacy Workshops (SPW).

[18] K. P. Soman, et al. Detecting malicious domain names using deep learning approaches at scale, 2018, J. Intell. Fuzzy Syst.

[19] R. Vinayakumar, et al. DeepMalNet: Evaluating shallow and deep networks for static PE malware detection, 2018, ICT Express.

[20] Zhi-Hua Zhou, et al. Training Cost-sensitive Neural Networks with Methods Addressing the Class Imbalance Problem, 2022, IEEE Transactions on Knowledge and Data Engineering.

[21] Rui Zhang, et al. Malware identification using visualization images and deep learning, 2018, Comput. Secur.

[22] Alex Graves, et al. Long Short-Term Memory, 2020, Computer Vision.

[23] Prabaharan Poornachandran, et al. ScaleNet: Scalable and Hybrid Framework for Cyber Threat Situational Awareness Based on DNS, URL, and Email Data Analysis, 2019, J. Cyber Secur. Mobil.

[24] Daniel Gibert, et al. Using convolutional neural networks for classification of malware represented as images, 2018, Journal of Computer Virology and Hacking Techniques.