Classifying Ransomware Using Machine Learning Algorithms

Ransomware remains a persistent threat and has driven an ongoing battle between the development of new attack techniques and their detection. Detection and mitigation systems have been developed and are in wide-scale use; however, their reactive nature means they must be continually evolved and updated. This is largely because detection mechanisms can often be circumvented by changing the malicious code and its behaviour. In this paper, we demonstrate a classification technique that integrates static and dynamic features to increase the accuracy of ransomware detection and classification. We train supervised machine learning algorithms, evaluate them on a test set, and use a confusion matrix to observe accuracy, enabling a systematic comparison of the algorithms. On the test set, Naive Bayes achieved an accuracy of 96%, SVM 99.5%, random forest 99.5%, and 96%. We also use Youden’s index to summarise sensitivity and specificity.
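As a minimal sketch of the evaluation step described above, the following Python computes a binary confusion matrix and derives accuracy, sensitivity, specificity, and Youden's index (J = sensitivity + specificity - 1) from it. The function names, the label encoding (1 for ransomware, 0 for benign), and the toy labels are assumptions made for illustration; they are not the paper's data or implementation, and the sketch assumes both classes occur in the true labels.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for binary labels; `positive` marks the ransomware class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["tn"], counts["fn"]


def summarise(y_true, y_pred, positive=1):
    """Derive accuracy, sensitivity, specificity and Youden's J from the confusion matrix."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden_j": sensitivity + specificity - 1,
    }


if __name__ == "__main__":
    # Hypothetical labels, for illustration only: 1 = ransomware, 0 = benign.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    for name, value in summarise(y_true, y_pred).items():
        print(f"{name}: {value:.3f}")
```

Running the same summary over each classifier's test-set predictions gives one row of metrics per algorithm, which is one way the comparison between algorithms could be laid out.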
