Multi-Loss Siamese Neural Network With Batch Normalization Layer for Malware Detection

Malware detection is an essential task in cyber security. As malicious attacks continue to grow, detecting unknown malware with high accuracy becomes increasingly challenging. Current deep learning-based approaches to malware detection are typically trained on large numbers of labeled samples from existing malware families, so their ability to detect new, unseen malware (such as that used in a zero-day attack) is limited. To address this issue, we propose a new one-shot model called “Multi-Loss Siamese Neural Network with Batch Normalization Layer” that can work with fewer samples while providing high detection accuracy. Our model uses a Siamese Neural Network, trained with only a few samples, to detect new variants of malware. It is equipped with batch normalization and multiple loss functions to address the overfitting caused by the small number of samples and the vanishing gradient problem that can arise from binary cross-entropy loss, and to improve the feature embedding space and thus the detection accuracy. In addition, we illustrate a way to convert raw binary files into grayscale malware images and to generate the positive and negative pairs needed to train the Siamese Neural Network. Our experimental results show that our model outperforms existing similar methods.
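
The conversion of raw binaries into grayscale images and the generation of positive and negative training pairs can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `bytes_to_gray_image` and `make_pairs`, the fixed image width of 64, the zero padding of the last row, and the negative-sampling ratio are all hypothetical choices made for the example.

```python
import itertools
import random

import numpy as np


def bytes_to_gray_image(data: bytes, width: int = 64) -> np.ndarray:
    """Treat each byte as one 8-bit pixel and reshape into rows of `width`.

    The width and the zero padding of the final row are assumptions of
    this sketch; the abstract does not specify them.
    """
    if not data:
        raise ValueError("empty binary")
    pixels = np.frombuffer(data, dtype=np.uint8)
    height = -(-len(pixels) // width)  # ceiling division
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[: len(pixels)] = pixels
    return padded.reshape(height, width)


def make_pairs(labels, negatives_per_positive=1, seed=0):
    """Return (i, j, target) index triples for Siamese training.

    target is 1 when samples i and j share a family label (positive pair)
    and 0 otherwise (negative pair). Negative pairs are subsampled so
    that they do not swamp the positives; the ratio is hypothetical.
    """
    rng = random.Random(seed)
    positives, negatives = [], []
    for i, j in itertools.combinations(range(len(labels)), 2):
        (positives if labels[i] == labels[j] else negatives).append((i, j))
    k = min(len(negatives), len(positives) * negatives_per_positive)
    sampled = rng.sample(negatives, k)
    return [(i, j, 1) for i, j in positives] + [(i, j, 0) for i, j in sampled]
```

Because binaries differ in length, the images produced this way differ in height; a real pipeline would presumably resize or crop them to a common shape before feeding both branches of the network.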
