Experimental Comparison of Machine Learning Models in Malware Packing Detection

Malware is now widely distributed using techniques such as packing, encoding, and obfuscation to bypass anti-virus software. These techniques allow malware to survive longer, infect computers and devices for longer periods, spawn numerous mutated variants, and increase the time experts need for analysis. Packers disrupt the reverse engineering process, making it difficult for security researchers to analyze new or unknown malware. To analyze as many samples as possible, we therefore need to first detect which samples are packed, analyze the non-packed samples directly, and then unpack the packed ones for analysis. Previous packing detection methods relied mainly on signature and entropy detection. However, the miss rate of these methods has increased with the appearance of custom packers. To address these problems, there has been considerable research on machine learning-based malware packing detection and classification. In this paper, we present an extensive experimental comparison of these machine learning-based algorithms. In particular, we extract a total of 13 important features and consider eight machine learning algorithms for detecting packed malware. Experimental results show that we can also accurately detect malware packed by custom packers, which were not studied in previous work.
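As an illustration of the entropy-based detection mentioned above, the following is a minimal sketch of a byte-entropy check. It is not the feature extraction used in this paper: the function names and the 7.0 threshold are assumptions chosen only for illustration. The intuition is that compressed or encrypted content tends to have byte entropy close to the 8 bits-per-byte maximum, whereas ordinary code and data are typically lower.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical cut-off for illustration only; real detectors tune this
# value and usually evaluate entropy per section rather than per file.
ENTROPY_THRESHOLD = 7.0


def looks_packed(data: bytes, threshold: float = ENTROPY_THRESHOLD) -> bool:
    """Flag data whose byte entropy meets or exceeds the threshold."""
    return shannon_entropy(data) >= threshold
```

A single global threshold of this kind is exactly what custom packers can evade, for example by padding with low-entropy bytes, which is one motivation for combining several features in a learned classifier.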
