Structural classification and similarity measurement of malware

This paper proposes a lightweight method that uses the growing hierarchical self-organizing map (GHSOM) for malware detection and structural classification, together with a new method for measuring the structural similarity between classes. A dynamic link library (DLL) file is an executable file used in the Windows operating system that allows applications to share code and other resources to perform particular tasks. We classify malware by mining the DLL files that each sample uses. Because malware families evolve quickly, they raise new problems, such as how to link a newly observed family to existing ones. The experiment shows that our GHSOM-based structural classification can address these issues and generate a malware classification tree according to the similarity of malware families. © 2014 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
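As a rough illustration of DLL-based features and family similarity, the sketch below encodes each sample as a binary vector over the DLLs it imports, averages the vectors per family, and compares families by cosine similarity. It is not the GHSOM method of the paper: the centroid-plus-cosine comparison is a simpler stand-in, and the family names, DLL sets, and helper functions (`dll_vector`, `centroid`, `cosine`) are hypothetical, chosen only for the example.

```python
from math import sqrt

def dll_vector(imported, vocabulary):
    """Binary feature vector: 1.0 if the sample imports the DLL, else 0.0."""
    return [1.0 if dll in imported else 0.0 for dll in vocabulary]

def centroid(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical toy data: each sample is the set of DLLs it imports.
samples = {
    "family_a": [
        {"kernel32.dll", "ws2_32.dll"},
        {"kernel32.dll", "ws2_32.dll", "advapi32.dll"},
    ],
    "family_b": [
        {"kernel32.dll", "ws2_32.dll", "wininet.dll"},
    ],
    "family_c": [
        {"user32.dll", "gdi32.dll"},
    ],
}

# Shared, ordered vocabulary of every DLL seen in the toy data.
vocabulary = sorted({dll for fam in samples.values() for s in fam for dll in s})

# One centroid per family in the common feature space.
centroids = {
    name: centroid([dll_vector(s, vocabulary) for s in fam])
    for name, fam in samples.items()
}

sim_ab = cosine(centroids["family_a"], centroids["family_b"])
sim_ac = cosine(centroids["family_a"], centroids["family_c"])

print(f"family_a vs family_b: {sim_ab:.2f}")
print(f"family_a vs family_c: {sim_ac:.2f}")
```

In this toy data, family_a and family_b share networking DLLs and score higher than family_a and family_c, which share none. A hierarchical method such as GHSOM would organize such vectors into a tree of maps instead of comparing centroids pairwise.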
