Understanding Neuro-Fuzzy on a class of multinomial malware detection problems

Malware classification has become an important task in protecting privacy and sensitive information from being stolen or modified. A number of malware categories and families have emerged over the last decade targeting Microsoft Windows, since it is the most attractive platform for virus developers. Software for this OS is distributed as Portable Executable (PE) files. The majority of commercial anti-virus solutions use signature-based detection, where a malware pattern is described by a unique crisp identifier of the corresponding PE file content. Neuro-Fuzzy is one of the promising Hybrid Intelligence methods suitable for malware detection. Although Neuro-Fuzzy has been used successfully for binary malware classification, it has not been investigated in more complex cases such as multinomial classification of malware categories and families. The advantage of this method is its ability to produce a generalized fuzzy-rule model applicable to real-world use. For this study we created a novel large-scale malware dataset that includes a variety of malware samples. Moreover, reports from VirusTotal and PEframe were collected to obtain PE header features and the species naming used by major anti-virus vendors. Finally, we applied a tuned Neuro-Fuzzy to handle multinomial problems. This paper serves as a stepping stone for future analysis of aspects of Neuro-Fuzzy methods for real-world malware classification.
