How to Automatically Identify the Homology of Different Malware

APT (Advanced Persistent Threat) attacks are evolving rapidly and have become a severe threat. In this paper, malware samples are called homologous if they were developed by the same author or organization. Identifying the homology of malware used in different APT attacks helps in reconstructing attack scenarios, tracking attackers, and even defending against new APT attacks. In the anti-malware industry, homology identification currently still relies on manual analysis and the experience of security experts. Such analysis is persuasive, but it is inefficient and time-consuming. To improve both effectiveness and efficiency, this paper proposes an automatic malware homology identification method. Six types of API (Application Programming Interface) call behaviors are defined according to programming habits and extracted from binary samples by static analysis. Based on these API call behaviors, the homologous degree of two malware samples is calculated using the Jaccard similarity coefficient. Homology is then identified by comparing the homologous degree with a threshold. Experimental evaluations on real-world samples show that the method achieves a high accuracy rate and an acceptable recall rate.
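The comparison step described above can be illustrated with a minimal sketch. It assumes each sample has already been reduced to a set of API call behavior features. The feature strings, the function names `jaccard` and `is_homologous`, the single pooled feature set, and the threshold value 0.5 are all hypothetical choices made for illustration; the abstract does not specify the six behavior types, how they are combined, or the threshold used.

```python
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity coefficient |A ∩ B| / |A ∪ B| of two feature sets."""
    union = a | b
    if not union:
        # Assumption: two samples with no extracted features are treated
        # as having homologous degree 0 rather than 1.
        return 0.0
    return len(a & b) / len(union)


def is_homologous(a: Set[str], b: Set[str], threshold: float = 0.5) -> bool:
    """Compare the homologous degree with a threshold (0.5 is hypothetical)."""
    return jaccard(a, b) >= threshold


# Hypothetical API call behavior features for two samples.
sample_a = {
    "CreateFileA>WriteFile",
    "RegOpenKeyExA>RegSetValueExA",
    "socket>connect",
    "VirtualAlloc>memcpy",
}
sample_b = {
    "CreateFileA>WriteFile",
    "RegOpenKeyExA>RegSetValueExA",
    "socket>connect",
    "LoadLibraryA>GetProcAddress",
}

degree = jaccard(sample_a, sample_b)  # 3 shared features / 5 distinct features
print(f"homologous degree: {degree:.2f}")
print("homologous:", is_homologous(sample_a, sample_b))
```

In this toy case the two samples share three of five distinct features, so the homologous degree is 0.6 and the pair is judged homologous under the assumed threshold. In practice the threshold would have to be tuned on labeled samples, since it trades accuracy against recall.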
