Malware Classification Using Machine Learning Algorithms and Tools

Malware classification is the process of assigning malware samples to families on the basis of their signatures. This work focuses on classifying emerging malware on the basis of features it shares with similar, known malware. The paper proposes a framework that categorizes malware samples into their families and can flag new malware samples for analysis. For this purpose, six different machine learning classification techniques are used. To make the results more comparable, and thus more reliable, the analysis is carried out with two different tools, KNIME and Orange. The proposed work can help in identifying, and thus cleaning, new malware and in classifying malware into families. The correctness of the family classification is assessed in terms of the confusion matrix, accuracy, and Cohen's Kappa. The evaluation shows that Random Forest gives the highest accuracy.
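The three evaluation measures named above are related: accuracy and Cohen's Kappa can both be derived from the confusion matrix. The following is a minimal Python sketch of that relationship, not the paper's KNIME or Orange workflow. The family names and the labels in `y_true` and `y_pred` are hypothetical toy data chosen only for illustration, and the function names are assumptions of this sketch.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true families, columns are predicted families."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohens_kappa(matrix):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    total = sum(sum(row) for row in matrix)
    p_o = accuracy(matrix)  # observed agreement
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Expected agreement if predictions were independent of the truth.
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if p_e == 1.0:
        return 0.0  # degenerate case: only one class present on both sides
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy data: three made-up family labels, ten samples.
labels = ["trojan", "worm", "adware"]
y_true = ["trojan", "trojan", "trojan", "worm", "worm",
          "worm", "adware", "adware", "adware", "adware"]
y_pred = ["trojan", "trojan", "worm", "worm", "worm",
          "adware", "adware", "adware", "adware", "trojan"]

matrix = confusion_matrix(y_true, y_pred, labels)
print(matrix)
print(round(accuracy(matrix), 3))
print(round(cohens_kappa(matrix), 3))
```

Kappa is lower than accuracy here because it discounts the agreement a classifier would reach by chance given the class frequencies, which is why it is a useful companion to accuracy when malware families are unevenly represented.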
