Accuracy Improved Malware Detection Method using Snort Sub-signatures and Machine Learning Techniques

Malware is a major computer security concern because many computing systems are connected to the Internet. The amount of malware has increased over the years, and new malware emerges daily. These new variants can evade conventional detection systems through obfuscation. Machine learning (ML) is one of the promising approaches to malware detection. This work presents a static malware detection system that uses n-gram features and machine learning techniques. Known malware sub-signatures are then used to reduce the large feature search space generated by n-gram feature extraction, since the size of the feature space directly affects the performance and detection accuracy of ML malware classifiers. Multiple feature selection methods, which minimize the number of features, and multiple ML classifiers are also analyzed to improve malware detection accuracy. The results show that analyzing n-gram features together with Snort sub-signature features using machine learning can achieve a malware detection accuracy above 99.78%, reduce the processing time of the best-performing SVM classifier to 5 seconds for the entire data set, and yield a zero false positive rate (FPR) for most of the evaluated ML classifiers when 4-gram features are used.
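To make the idea of restricting n-gram features to known sub-signatures concrete, the following is a minimal sketch and not the authors' implementation. It assumes byte-level 4-grams and a vocabulary built only from n-grams that occur in a list of sub-signatures. The function names (`byte_ngrams`, `signature_vocabulary`, `feature_vector`) and the byte strings are hypothetical and are not taken from any real Snort rule set.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 4) -> Counter:
    """Count every overlapping n-byte window in data."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def signature_vocabulary(signatures: list[bytes], n: int = 4) -> list[bytes]:
    """Collect the distinct n-grams that occur in any known sub-signature."""
    vocab = set()
    for sig in signatures:
        vocab.update(byte_ngrams(sig, n))
    return sorted(vocab)


def feature_vector(data: bytes, vocab: list[bytes], n: int = 4) -> list[int]:
    """Count only the vocabulary n-grams; every other n-gram is ignored."""
    counts = byte_ngrams(data, n)
    return [counts[g] for g in vocab]


# Hypothetical sub-signatures (illustrative bytes, not real Snort content).
signatures = [b"\x90\x90\xeb\x04\x5e", b"\xde\xad\xbe\xef"]
vocab = signature_vocabulary(signatures)

# Hypothetical sample to be classified.
sample = b"\x00\x90\x90\xeb\x04\x5e\xde\xad\xbe\xef\xde\xad\xbe\xef"
vector = feature_vector(sample, vocab)
print(len(vocab), vector)
```

In this sketch the feature vector has one entry per signature-derived n-gram rather than one per n-gram seen in the sample, which is how the feature space stays small. A vector of this kind could then be passed to any feature selection method or classifier, such as an SVM.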
