Classification of Metamorphic Virus Using N-Grams Signatures

A metamorphic virus can change, translate, and rewrite its own code after infecting a system in order to evade detection. An undetected metamorphic virus can then seriously damage the computer system. It is therefore vital to design a classification model that can detect this kind of virus. This paper focuses on the detection of metamorphic viruses using the Term Frequency-Inverse Document Frequency (TF-IDF) technique. The research was conducted on a Second Generation virus dataset. First, the classification model clusters the metamorphic viruses using TF-IDF. The resulting virus clusters are then evaluated with the Naive Bayes algorithm in terms of accuracy. The virus classes and features are extracted from assembly-language bi-grams. The results show that the proposed model was able to classify metamorphic viruses using TF-IDF, with an optimal number of virus classes and an average accuracy of 94.2%.
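
To make the feature-extraction idea concrete, the following is a minimal sketch of TF-IDF weighting over opcode bi-grams. It is not the paper's implementation: the opcode sequences in `samples`, the function names `bigrams` and `tfidf`, and the specific TF-IDF variant (relative term frequency times the natural log of N over document frequency) are all assumptions made for illustration, since the abstract does not state these details.

```python
import math
from collections import Counter

def bigrams(opcodes):
    """Return the list of adjacent opcode pairs (bi-grams)."""
    return list(zip(opcodes, opcodes[1:]))

def tfidf(documents):
    """Compute TF-IDF weights for each bi-gram in each opcode sequence.

    Assumed variant (not taken from the paper):
      tf  = count of the bi-gram in the sample / total bi-grams in the sample
      idf = ln(N / number of samples containing the bi-gram)
    """
    grams_per_doc = [Counter(bigrams(doc)) for doc in documents]
    n_docs = len(documents)

    # Document frequency: in how many samples each bi-gram appears.
    doc_freq = Counter()
    for counts in grams_per_doc:
        doc_freq.update(counts.keys())

    weights = []
    for counts in grams_per_doc:
        total = sum(counts.values())
        weights.append({
            gram: (count / total) * math.log(n_docs / doc_freq[gram])
            for gram, count in counts.items()
        })
    return weights

# Hypothetical opcode sequences standing in for disassembled virus samples.
samples = [
    ["mov", "push", "call", "mov", "push"],
    ["mov", "push", "xor", "jmp"],
    ["nop", "nop", "xor", "jmp"],
]
weights = tfidf(samples)
```

Under this variant, a bi-gram that occurs in every sample gets weight zero, while a bi-gram confined to one sample is weighted most heavily. The resulting per-sample weight dictionaries could then be turned into feature vectors for a classifier such as Naive Bayes.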
