Efficient Dynamic Malware Analysis Based on Network Behavior Using Deep Learning

Malware authors and attackers continually try to evade detection methods to accomplish their goals. Detection methods are broadly divided into three types: static-feature-based, host-behavior-based, and network-behavior-based. Static-feature-based methods can be evaded using packing techniques. Host-behavior-based methods can also be evaded using code injection techniques, such as API hooks and dynamic link library hooks. This arms race around static-feature-based and host-behavior-based methods increases the importance of network-behavior-based methods. Because infected hosts must communicate with attackers, network-behavior-based methods are difficult to evade. The effectiveness of such methods depends on how wide a variety of communications can be collected from malware samples. However, analyzing every new malware sample for a long period is infeasible. Therefore, we propose a method that uses network behavior to determine whether dynamic analysis should be suspended, so that malware communications can be collected efficiently and exhaustively. The key idea behind our method is to focus on two characteristics of malware communication: the change in communication purpose and the common latent function. From the viewpoint of data structure, these characteristics resemble those of natural language, for which sophisticated analysis methods have been proposed in the field of natural language processing. For this reason, we applied the recursive neural network, which has recently exhibited high classification performance, to our method. In an evaluation with 29,562 malware samples, our method reduced analysis time by 67.1% while keeping the coverage of collected URLs at 97.9% of that achieved by a method that always runs the full analysis.
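The two evaluation figures above (analysis time reduction and URL coverage) can be illustrated with a small sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: time reduction is taken as one minus the ratio of total analysis time with early suspension to total time under full analysis, and coverage as the fraction of URLs seen under full analysis that are also seen before suspension. The `SampleRun` type, its fields, and the toy values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleRun:
    """One malware sample analyzed twice (hypothetical record layout)."""
    full_seconds: float      # duration of the full, unsuspended analysis
    stopped_seconds: float   # duration when analysis is suspended early
    full_urls: frozenset     # URLs observed during the full analysis
    stopped_urls: frozenset  # URLs observed before suspension


def time_reduction(runs):
    """Assumed definition: fraction of total analysis time saved."""
    full = sum(r.full_seconds for r in runs)
    stopped = sum(r.stopped_seconds for r in runs)
    return 1.0 - stopped / full


def url_coverage(runs):
    """Assumed definition: share of full-analysis URLs still collected."""
    full = set().union(*(r.full_urls for r in runs))
    stopped = set().union(*(r.stopped_urls for r in runs))
    return len(stopped & full) / len(full)


# Toy data, not taken from the paper's evaluation.
runs = [
    SampleRun(300.0, 60.0,
              frozenset({"http://a.example/x", "http://a.example/y"}),
              frozenset({"http://a.example/x", "http://a.example/y"})),
    SampleRun(300.0, 120.0,
              frozenset({"http://b.example/p", "http://b.example/q"}),
              frozenset({"http://b.example/p"})),
]

reduction = time_reduction(runs)
coverage = url_coverage(runs)
print(f"time reduction: {reduction:.1%}, URL coverage: {coverage:.1%}")
```

Both functions assume a non-empty list of runs with at least one URL observed under full analysis; aggregating over all samples rather than averaging per sample is also an assumption of this sketch.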
