Radial Spike and Slab Bayesian Neural Networks for Sparse Data in Ransomware Attacks

Ransomware attacks are increasing at an alarming rate, leading to large financial losses, unrecoverable encrypted data, data leakage, and privacy concerns. Prompt detection of ransomware attacks, particularly during the encryption stage, is required to minimize further damage. However, the frequency and structure of the observed ransomware attack data make this task difficult to accomplish in practice. The data corresponding to ransomware attacks represent temporal, high-dimensional, sparse signals, with limited records and highly imbalanced classes. While traditional deep learning models have achieved state-of-the-art results in a wide variety of domains, Bayesian Neural Networks, a class of probabilistic models, are better suited to the issues of ransomware data. These models combine ideas from Bayesian statistics with the rich expressive power of neural networks. In this paper, we propose the Radial Spike and Slab Bayesian Neural Network, a new type of Bayesian Neural Network that includes a new form of the approximate posterior distribution. The model scales well to large architectures and recovers the sparse structure of target functions. We provide a theoretical justification for using this type of distribution, as well as a computationally efficient method to perform variational inference. We demonstrate the performance of our model on a real dataset of ransomware attacks and show improvement over a large number of baselines, including state-of-the-art models such as Neural ODEs (ordinary differential equations). In addition, we propose to represent low-level events as MITRE ATT&CK tactics, techniques, and procedures (TTPs), which allows the model to better generalize to unseen ransomware attacks.
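To make the idea of a radial spike-and-slab approximate posterior more concrete, the following is a minimal sketch of how one weight sample might be drawn. The abstract does not give the parameterization, so everything here is an assumption: the slab is taken to be a radial perturbation of a mean (a uniformly random direction scaled by a scalar Gaussian radius), and the spike is a per-weight Bernoulli gate that sets weights exactly to zero. The function name `sample_radial_spike_slab` and the parameters `mu`, `sigma`, and `keep_prob` are hypothetical and are not taken from the paper.

```python
import numpy as np

def sample_radial_spike_slab(mu, sigma, keep_prob, rng):
    """Draw one layer of weights (illustrative sketch, not the paper's method).

    Assumed form: w = gate * (mu + sigma * (eps / ||eps||) * r), with
    eps ~ N(0, I), scalar r ~ N(0, 1), and gate ~ Bernoulli(keep_prob).
    """
    eps = rng.standard_normal(mu.shape)
    direction = eps / np.linalg.norm(eps)    # random direction on the unit sphere
    radius = rng.standard_normal()           # one scalar radius for the whole layer
    slab = mu + sigma * direction * radius   # radial sample centred on mu
    gate = rng.random(mu.shape) < keep_prob  # False entries fall on the spike at zero
    return np.where(gate, slab, 0.0)

rng = np.random.default_rng(0)
mu = np.ones((4, 3))
sigma = 0.1 * np.ones((4, 3))
w = sample_radial_spike_slab(mu, sigma, keep_prob=0.5, rng=rng)
```

In this sketch the gate produces exact zeros, which is one way a model could express the sparse structure the abstract refers to, while the radial slab keeps sampled weights close to their mean even when the layer is large.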
