HAS-Analyzer: Detecting HTTP-based C&C based on the Analysis of HTTP Activity Sets

Because HTTP-related ports are allowed through firewalls, they are an obvious point for launching cyber attacks. In particular, malware uses HTTP protocols to communicate with their master servers. We call this an HTTP-based command and control (C&C) server. Most previous studies concentrated on the behavioral pattern of C&Cs. However, these approaches need a well-defined white list to reduce the false positive rate because there are many benign applications, such as automatic update checks and web refreshes, that have a periodic access pattern. In this paper, we focus on finding new discriminative features of HTTP-based C&Cs by analyzing HTTP activity sets. First, a C&C shows a few connections at a time (low density). Second, the content of a request or a response is changed frequently among consecutive C&Cs (high content variability). Based on these two features, we propose a novel C&C analysis mechanism that detects the HTTP-based C&C. The HAS-Analyzer can classify the HTTP-based C&C with an accuracy of more than 96% and a false positive rate of 1.3% without using any white list.