Three-phase behavior-based detection and classification of known and unknown malware

To improve both the accuracy and the efficiency of detecting known and unknown malware, we propose a three-phase behavior-based malware detection and classification approach: a faster detector in the first phase filters out most samples, a slower detector in the second phase examines the remaining ambiguous samples, and a classifier in the third phase identifies the malware type. The faster detector executes programs in a sandbox and extracts representative behaviors, which are fed into a trained artificial neural network to evaluate their maliciousness. The slower detector extracts and matches the LCSs of system call sequences, which are fed into a trained Bayesian model to calculate their maliciousness. In the third phase, we define malware behavior vectors and calculate the cosine similarity between them to classify the malware. The experimental results show that the hybrid two-phase detection scheme outperforms the one-phase schemes, achieving a false negative rate of 3.6% and a false positive rate of 6.8%. The third-phase classifier distinguishes known-type malware with an accuracy of 85.8%. Copyright © 2015 John Wiley & Sons, Ltd.
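The abstract names two generic building blocks, LCS matching of system call sequences and cosine similarity between behavior vectors. The Python sketch below illustrates both. It is not the authors' implementation: it assumes LCS means longest common subsequence, and the function names, the system call names, the vector features, and the malware type labels are all hypothetical. The neural network and the Bayesian model are omitted because the abstract gives no detail about them.

```python
from math import sqrt


def lcs_length(a, b):
    """Length of the longest common subsequence of two call sequences.

    Assumption: "LCS" is read as longest common subsequence.
    Uses the standard dynamic program, keeping one row at a time.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def classify(sample, prototypes):
    """Return the label whose prototype vector is most similar to sample.

    prototypes maps a (hypothetical) malware type name to a behavior vector.
    """
    return max(prototypes, key=lambda name: cosine_similarity(sample, prototypes[name]))


if __name__ == "__main__":
    # Hypothetical system call traces.
    trace_a = ["open", "read", "write", "close"]
    trace_b = ["open", "write", "connect", "close"]
    print(lcs_length(trace_a, trace_b))

    # Hypothetical behavior vectors: counts of three made-up behavior features.
    prototypes = {"worm": [1, 0, 0], "trojan": [0, 1, 1]}
    print(classify([3, 0, 1], prototypes))
```

A nearest-prototype rule is only one way to turn similarity scores into a label. The abstract does not say how the paper aggregates similarities or handles samples that match no known type.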
