Methods for Training Neural Networks with Zero False Positives for Malware Detection

With the increase in malware samples over the last decade, more antivirus products have started to use machine learning algorithms to cope with the large volume of data. Thanks to good results and advances in learning infrastructure, neural networks have become one of the preferred ways of addressing this problem. However, these algorithms need to be fine-tuned so that they do not add an overhead of costly false positives. This paper presents a study that takes a closer look at two techniques used for false positive mitigation: one side training and weight class adjustment. The techniques are used to train a neural network with zero false positives and are compared to find out which one gives the highest true positive rate. Using a large dataset constructed over several years, we show that by using these techniques a 90% true positive rate can be obtained while training for 0 false positives.
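To make the idea of trading true positives for zero false positives concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes binary labels (1 = malicious, 0 = benign) and model scores in [0, 1]. The function names, the `benign_weight` parameter, and the toy data are all hypothetical. The first function shows one common form of class weighting, where errors on benign samples are penalized more heavily in a cross-entropy loss; the other two pick a decision threshold that flags no benign sample in a given set and report the true positive rate at that threshold.

```python
import numpy as np

def weighted_bce(y_true, p_pred, benign_weight=10.0, eps=1e-12):
    """Binary cross-entropy in which errors on benign samples (label 0) cost more.

    benign_weight is a hypothetical knob: values above 1 push the model
    away from flagging benign files, at the cost of missing some malware.
    """
    p = np.clip(p_pred, eps, 1.0 - eps)
    loss_malicious = -y_true * np.log(p)
    loss_benign = -(1.0 - y_true) * np.log(1.0 - p)
    return float(np.mean(loss_malicious + benign_weight * loss_benign))

def zero_fp_threshold(y_true, scores):
    """Highest score among benign samples; flagging only scores above it gives 0 FPs."""
    return float(np.max(scores[y_true == 0]))

def tpr_at_zero_fp(y_true, scores):
    """True positive rate when the threshold is set so that no benign sample is flagged."""
    thr = zero_fp_threshold(y_true, scores)
    detected = scores > thr
    return float(np.mean(detected[y_true == 1]))

# Toy data (illustrative only): two benign and three malicious samples.
y = np.array([0, 0, 1, 1, 1])
scores = np.array([0.10, 0.40, 0.35, 0.80, 0.90])

threshold = zero_fp_threshold(y, scores)
tpr = tpr_at_zero_fp(y, scores)
false_positives = int(np.sum((scores > threshold) & (y == 0)))
```

Note that a threshold chosen this way guarantees zero false positives only on the samples it was computed from; it makes no promise about unseen files.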
