A Real-Time PE-Malware Detection System Based on CHI-Square Test and PE-File Features

Building an efficient malware detection system requires attention to two properties: detection accuracy and detection time. Finding an appropriate balance between the two remains a challenging problem. In this paper, we present a real-time PE (Portable Executable) malware detection system based on the analysis of the information stored in the PE-Optional Header fields (PEF). Our system uses a combination of the Chi-square (KHI2) score and the Phi (ϕ) coefficient as its feature selection method. We evaluated the system using the Rotation Forest classifier implemented in WEKA and reached an accuracy of more than 97%. The system can categorize a file in 0.077 seconds, which makes it suitable for real-time malware detection.
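To make the two feature selection statistics concrete, the following is a minimal sketch of how a Chi-square score and a Phi coefficient can be computed for one binary feature against the malware/benign label, using the standard 2×2 contingency-table formulas. The function name, the argument layout, and the example counts are illustrative assumptions; the abstract does not say how the two scores are combined or thresholded, so that step is not shown.

```python
import math


def chi2_and_phi(a, b, c, d):
    """Chi-square score and signed Phi coefficient for a 2x2 table.

    Hypothetical layout (an assumption, not taken from the paper):
        a = malware files where the feature is present
        b = benign files where the feature is present
        c = malware files where the feature is absent
        d = benign files where the feature is absent
    """
    n = a + b + c + d
    # Product of the four marginal totals (two rows, two columns).
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero marginal means the feature or the class is constant,
        # so the feature carries no usable association.
        return 0.0, 0.0
    diff = a * d - b * c
    chi2 = n * diff * diff / denom   # Pearson chi-square, no continuity correction
    phi = diff / math.sqrt(denom)    # signed; phi**2 == chi2 / n
    return chi2, phi


# Made-up counts: the feature appears in 40 of 50 malware files
# and in 10 of 50 benign files.
chi2, phi = chi2_and_phi(40, 10, 10, 40)
print(chi2, phi)  # 36.0 0.6
```

For a 2×2 table the two statistics are linked by ϕ² = χ²/n, so the Chi-square score reflects the strength of evidence for an association given the sample size, while ϕ gives its direction and a size-independent magnitude.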
