Neural Sequential Malware Detection with Parameters

Sequential models which analyze system API calls have shown promise for detecting unknown malware. Athiwaratkun and Stokes recently proposed a two-stage model which uses a long short-term memory (LSTM) model for learning a set of features which are then input to a second classifier. Kolosnjaji et al., first use a convolutional neural network followed by an LSTM to predict unknown malware. However, neither of these models consider the parameters which are input to the system API calls. These input parameters offer significant information regarding malicious intent. In this paper, we extend Athiwaratkun's model to include each system API call's two most input parameters. We then show that the proposed model dominates these previously proposed models in terms of the receiver operating characteristic (ROC) curve.

[1]  Jürgen Schmidhuber,et al.  Learning to forget: continual prediction with LSTM , 1999 .

[2]  Martín Abadi,et al.  TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.

[3]  Razvan Pascanu,et al.  Malware classification with recurrent networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[4]  Aditya P. Mathur,et al.  A Survey of Malware Detection Techniques , 2007 .

[5]  Jeffrey O. Kephart,et al.  A biologically inspired immune system for computers , 1994 .

[6]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[7]  Jack W. Stokes,et al.  Large-scale malware classification using random projections and neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[8]  Jack W. Stokes,et al.  Malware classification with LSTM and GRU language models and a character-level CNN , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[9]  Claudia Eckert,et al.  Deep Learning for Classification of Malware System Call Sequences , 2016, Australasian Conference on Artificial Intelligence.

[10]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[11]  Xiang Zhang,et al.  Character-level Convolutional Networks for Text Classification , 2015, NIPS.

[12]  P J Webros BACKPROPAGATION THROUGH TIME: WHAT IT DOES AND HOW TO DO IT , 1990 .

[13]  Wenyi Huang,et al.  MtNet: A Multi-Task Neural Network for Dynamic Malware Classification , 2016, DIMVA.

[14]  Konstantin Berlin,et al.  Deep neural network based malware detection using two dimensional binary program features , 2015, 2015 10th International Conference on Malicious and Unwanted Software (MALWARE).