Malware Variants Detection Using Density Based Spatial Clustering with Global Opcode Matrix

Over the past decades, the amount of malware has increased rapidly. Malware detection has become one of the most mission-critical security problems as threats spread from personal computers to cloud servers. Some researchers have proposed machine learning methods that detect malware variants by searching for similarities between known malware and its variants. However, the large search space leads to high time cost and memory consumption. To reduce the search space while retaining accuracy, we propose to first convert each malware sample into a global opcode matrix based on 2-tuple opcodes, and then cluster the opcode matrices into patterns. Malware variants can then be recognized by measuring their similarity to these patterns. The experiments demonstrate that our approach outperforms state-of-the-art approaches in time cost, memory consumption, and accuracy.
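
The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the six-opcode vocabulary `OPCODES`, the Euclidean distance, the use of cluster centroids as "patterns", the helper names, and the `eps`/`min_pts` values are all assumptions chosen for illustration. Each sample is turned into a normalized matrix of 2-tuple (consecutive pair) opcode frequencies, the matrices are grouped with a plain density-based clustering (DBSCAN), and a new sample is matched against the resulting patterns instead of against every known sample.

```python
import math
from collections import Counter

# Hypothetical, tiny global opcode vocabulary; a real one would be far larger.
OPCODES = ["mov", "push", "pop", "call", "add", "jmp"]
INDEX = {op: i for i, op in enumerate(OPCODES)}

def opcode_matrix(seq):
    """Flattened (row-major) matrix of normalized 2-tuple opcode frequencies."""
    n = len(OPCODES)
    counts = Counter(
        (INDEX[a], INDEX[b]) for a, b in zip(seq, seq[1:])
        if a in INDEX and b in INDEX  # pairs with unknown opcodes are skipped
    )
    total = sum(counts.values()) or 1
    return [counts[(i, j)] / total for i in range(n) for j in range(n)]

def distance(u, v):
    """Euclidean distance (an assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 marking noise."""
    labels = [None] * len(points)
    cluster = -1

    def neighbors(i):
        return [j for j in range(len(points))
                if distance(points[i], points[j]) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels

def build_patterns(points, labels):
    """One pattern per cluster: the element-wise mean of its matrices."""
    groups = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            groups.setdefault(lab, []).append(p)
    return {lab: [sum(col) / len(ps) for col in zip(*ps)]
            for lab, ps in groups.items()}

def match(sample, patterns, eps):
    """Label of the closest pattern within eps, or None if nothing is close."""
    best = min(patterns, key=lambda lab: distance(sample, patterns[lab]),
               default=None)
    if best is not None and distance(sample, patterns[best]) <= eps:
        return best
    return None

# Toy samples: two made-up families with two members each.
samples = [
    ["push", "mov", "call", "pop"] * 5,
    ["push", "mov", "call", "pop"] * 5 + ["add"],
    ["add", "jmp"] * 10,
    ["add", "jmp"] * 10 + ["mov"],
]
matrices = [opcode_matrix(s) for s in samples]
labels = dbscan(matrices, eps=0.2, min_pts=2)
patterns = build_patterns(matrices, labels)

variant = opcode_matrix(["push", "mov", "call", "pop"] * 6 + ["jmp"])
unrelated = opcode_matrix(["mov"] * 4)
print(labels, match(variant, patterns, 0.2), match(unrelated, patterns, 0.2))
```

In this sketch a new sample is compared with one pattern per cluster rather than with every stored sample, which is the search-space reduction the abstract refers to; how the paper itself derives and compares patterns may differ.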
