Recurrent Neural Networks Based Online Behavioural Malware Detection Techniques for Cloud Infrastructure

Several organizations use cloud technologies and resources to run a range of applications. These services relieve businesses of hardware management, scalability, and maintainability concerns for the underlying infrastructure. Major cloud service providers (CSPs) such as Amazon, Microsoft, and Google offer Infrastructure as a Service (IaaS) to meet the growing demand from such enterprises. This increased use of cloud platforms has made them an attractive target for attackers, which makes the security of cloud services a top priority for CSPs. In this respect, malware has been recognized as one of the most dangerous and destructive threats to cloud infrastructure (IaaS). In this paper, we study the effectiveness of deep learning techniques based on Recurrent Neural Networks (RNNs) for detecting malware in cloud Virtual Machines (VMs). We focus on two major RNN architectures: Long Short-Term Memory RNNs (LSTMs) and Bidirectional RNNs (BIDIs). These models learn the behaviour of malware over time from fine-grained, run-time, process-level system features such as CPU, memory, and disk utilization. We evaluate our approach on a dataset of 40,680 malicious and benign samples. The process-level features were collected using real malware running in an open online cloud environment with no restrictions, which is important both to emulate practical cloud provider settings and to capture the true behaviour of stealthy and sophisticated malware. Both our LSTM and BIDI models achieve detection rates above 99% across different evaluation metrics. In addition, we conduct an analysis to understand the significance of input data representations. Our results suggest that, in particular cases, input ordering does have some effect on the performance of the trained RNN models.
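To make the notion of input ordering concrete, the following is a minimal sketch of how per-process readings at each time step might be flattened into a fixed-length vector, so that a sample becomes a sequence of such vectors for an RNN. It is an illustration only and is not taken from the paper: the process names, the three features per process, the slot count `MAX_PROCESSES`, the zero-padding scheme, and the two orderings (alphabetical and by peak CPU) are all assumptions.

```python
from typing import Dict, List

# Hypothetical per-process readings: [cpu_percent, mem_percent, disk_read_kb].
Snapshot = Dict[str, List[float]]

FEATURES_PER_PROCESS = 3
MAX_PROCESSES = 4  # assumed fixed number of process slots per time step


def snapshot_to_vector(snapshot: Snapshot, order: List[str]) -> List[float]:
    """Flatten one time step into a fixed-length vector following `order`."""
    vector: List[float] = []
    for name in order[:MAX_PROCESSES]:
        # A process that is absent at this time step contributes zeros.
        vector.extend(snapshot.get(name, [0.0] * FEATURES_PER_PROCESS))
    # Zero-pad the remaining slots so every time step has the same length.
    vector.extend([0.0] * (MAX_PROCESSES * FEATURES_PER_PROCESS - len(vector)))
    return vector


def build_sequence(snapshots: List[Snapshot], order: List[str]) -> List[List[float]]:
    """Turn a list of snapshots into the time-ordered input of one sample."""
    return [snapshot_to_vector(s, order) for s in snapshots]


# Two made-up time steps of one VM; "miner" appears only in the second one.
snapshots: List[Snapshot] = [
    {"sshd": [0.1, 0.5, 0.0], "apache": [2.0, 1.5, 12.0]},
    {"sshd": [0.1, 0.5, 0.0], "apache": [2.5, 1.6, 8.0], "miner": [95.0, 3.0, 0.0]},
]

# Ordering A: alphabetical by process name.
alphabetical = sorted({name for s in snapshots for name in s})
# Ordering B: descending peak CPU usage across the sample.
by_peak_cpu = sorted(
    alphabetical,
    key=lambda n: -max(s.get(n, [0.0])[0] for s in snapshots),
)

seq_a = build_sequence(snapshots, alphabetical)
seq_b = build_sequence(snapshots, by_peak_cpu)
```

Both sequences carry the same values at every time step, only in different positions. A recurrent model reads each position as a distinct input dimension, which is one plausible reason the ordering could influence what it learns.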
