Two-Stage Ransomware Detection Using Dynamic Analysis and Machine Learning Techniques

Detecting ransomware is harder than detecting general malware because the number of ransomware variants with distinct signatures keeps growing, which renders traditional signature-based detection ineffective. Current ransomware detection techniques usually build a complex model that incorporates various behavioral traits, such as suspicious file activity, API call patterns or frequencies, registry keys, and file extensions. In this paper, we build a two-stage mixed ransomware detection model that combines a Markov model with a Random Forest model. First, we focus on Windows API call sequence patterns and build a Markov model to capture the characteristics of ransomware. Next, we apply a Random Forest machine learning model to the remaining data in order to control both the false positive rate (FPR) and the false negative rate (FNR). With this two-stage mixed detection method, we achieve an overall accuracy of 97.3% with 4.8% FPR and 1.5% FNR.
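To make the first stage concrete, the following is a minimal sketch of how a Markov model over API call sequences could score a sample and hand uncertain cases to a second stage. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the first-order chain, the add-alpha smoothing, the per-transition log-likelihood ratio, the `margin` threshold, the toy training traces, and all function names are hypothetical choices. The second-stage Random Forest is not implemented here; a `"defer"` result simply marks a sample that would be passed on to it.

```python
import math
from collections import defaultdict

def train_markov(sequences, alphabet, alpha=1.0):
    """Estimate first-order transition probabilities with add-alpha smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1.0
    model = {}
    for prev in alphabet:
        total = sum(counts[prev].values()) + alpha * len(alphabet)
        model[prev] = {nxt: (counts[prev][nxt] + alpha) / total for nxt in alphabet}
    return model

def log_likelihood(model, seq):
    """Sum of log transition probabilities along the sequence."""
    return sum(math.log(model[prev][nxt]) for prev, nxt in zip(seq, seq[1:]))

def stage_one(seq, ransom_model, benign_model, margin=0.4):
    """Classify by average log-likelihood ratio; defer when the evidence is weak."""
    steps = max(len(seq) - 1, 1)
    score = (log_likelihood(ransom_model, seq) - log_likelihood(benign_model, seq)) / steps
    if score > margin:
        return "ransomware"
    if score < -margin:
        return "benign"
    return "defer"  # would be handed to the second-stage classifier

# Toy traces (assumed for illustration only, not data from the paper).
ALPHABET = ["CreateFile", "ReadFile", "CryptEncrypt", "WriteFile", "DeleteFile", "CloseHandle"]
ransom_train = [["CreateFile", "ReadFile", "CryptEncrypt", "WriteFile", "DeleteFile"]] * 3
benign_train = [["CreateFile", "ReadFile", "CloseHandle"],
                ["CreateFile", "WriteFile", "CloseHandle"]] * 3

ransom_model = train_markov(ransom_train, ALPHABET)
benign_model = train_markov(benign_train, ALPHABET)

for trace in (ransom_train[0], benign_train[0], ["CreateFile", "ReadFile"]):
    print(trace, "->", stage_one(trace, ransom_model, benign_model))
```

On these toy traces, the encrypt-and-delete sequence is flagged as ransomware, the read-and-close sequence as benign, and the short two-call prefix is deferred because both models explain it almost equally well. Widening `margin` sends more samples to the second stage, which is one plausible way to trade off FPR against FNR.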
