BotProfiler: Profiling Variability of Substrings in HTTP Requests to Detect Malware-Infected Hosts

Malware is constantly evolving, which makes it difficult to prevent hosts from becoming infected. Many countermeasures against malware infection have been investigated, including the generation of network-based signatures or templates. Such templates use regular expressions so that they can detect polymorphic attacks. A potential problem with such templates, however, is that they sometimes regard benign communications as malicious, resulting in false positives, because a regular expression can match more than the malicious traffic it was derived from. Since the cost of responding to a malware infection is quite high, the number of false positives should be kept to a minimum. We therefore propose a system that generates templates causing fewer false positives than a conventional system. Our key idea is that malicious infrastructure, such as command and control (C&C), tends to be reused rather than created from scratch. We implemented the system and validated it using real traffic data. The results indicate that it reduced false positives by up to two-thirds compared with the conventional system and even increased the detection rate of infected hosts.
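The false-positive problem described above can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: the two templates, the request paths, and the labels are all hypothetical, chosen only to show how a template whose every variable substring is a wildcard also matches benign requests, while a template that keeps an invariant substring as a literal does not.

```python
import re

# Hypothetical over-generalized template: the script name and the id value
# are both replaced by wildcards.
GENERIC_TEMPLATE = re.compile(r"^/[a-z]+\.php\?id=[0-9]+$")

# Hypothetical tighter template: the script name, assumed not to vary across
# the observed malicious samples, is kept as a literal.
TIGHT_TEMPLATE = re.compile(r"^/gate\.php\?id=[0-9]+$")

# Hypothetical labeled request paths: (path, is_malicious).
labeled_requests = [
    ("/gate.php?id=4821", True),
    ("/gate.php?id=17", True),
    ("/index.php?id=3", False),
    ("/news.php?id=20150101", False),
]


def false_positives(template, labeled):
    """Return benign paths that the template matches."""
    return [p for p, malicious in labeled if not malicious and template.match(p)]


def detected(template, labeled):
    """Return malicious paths that the template matches."""
    return [p for p, malicious in labeled if malicious and template.match(p)]


for name, template in [("generic", GENERIC_TEMPLATE), ("tight", TIGHT_TEMPLATE)]:
    print(name,
          "detected:", len(detected(template, labeled_requests)),
          "false positives:", len(false_positives(template, labeled_requests)))
```

In this toy data both templates match the two malicious paths, but only the generic one also matches the benign paths. How a real system decides which substrings to keep literal is exactly what the paper addresses and is not reproduced here.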
