Botnet Identification in DDoS Attacks With Multiple Emulation Dictionaries

In a Distributed Denial of Service (DDoS) attack, a network (botnet) of dispersed agents (bots) sends requests to a website to saturate its resources. Since the requests are sent by automata, the typical way to detect them is to look for repetition patterns or commonalities among requests from the same user or from different users. For this reason, recent DDoS variants exploit communication layers that offer a broader range of admissible request patterns, such as the application layer. In this case, the malicious agents can pick legitimate messages from an emulation dictionary, and each individual agent sends a relatively low number of admissible requests, so that its activity does not appear suspicious. This problem has recently been addressed under the assumption that all members of the botnet use the same emulation dictionary. This assumption is an idealization of what occurs in practice, since different clusters of agents typically share only part of a global emulation dictionary. The diversity of emulation dictionaries across clusters makes botnet identification significantly more complex. This work tackles this issue and provides the following main contributions. First, we obtain an analytical characterization of the message innovation rate of a DDoS attack with multiple emulation dictionaries. Second, exploiting this result, we design a botnet identification algorithm equipped with a cluster expurgation rule, which, under appropriate technical conditions, is shown to provide exact classification of bots and normal users as the observation window size increases. Finally, an experimental campaign over real network traces is conducted to assess the validity of the theoretical analysis and to examine the impact of several non-idealities that are unavoidably observed in practical scenarios.
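To give the message innovation rate a concrete reading, the following minimal Python sketch treats it empirically as the number of distinct messages observed in a window divided by the window length. This definition, the function names, and the (timestamp, message) event format are illustrative assumptions, not the estimator or the identification algorithm developed in this work. The intuition it illustrates is that users drawing from a shared dictionary add fewer new messages jointly than the sum of their individual rates would suggest.

```python
from typing import Hashable, Iterable, Tuple

Event = Tuple[float, Hashable]  # (timestamp, message); hypothetical format


def empirical_innovation_rate(events: Iterable[Event], window: float) -> float:
    """Distinct messages seen in [0, window], per unit time (assumed definition)."""
    if window <= 0:
        raise ValueError("window must be positive")
    distinct = {msg for t, msg in events if 0 <= t <= window}
    return len(distinct) / window


def joint_innovation_rate(
    events_a: Iterable[Event], events_b: Iterable[Event], window: float
) -> float:
    """Innovation rate of the merged traffic of two users."""
    return empirical_innovation_rate(list(events_a) + list(events_b), window)


# Toy traces: both users reuse message "y", so the joint rate
# falls below the sum of the two individual rates.
user_a = [(1.0, "x"), (2.0, "y"), (3.0, "x")]
user_b = [(1.5, "y"), (4.0, "z")]
window = 10.0

rate_a = empirical_innovation_rate(user_a, window)
rate_b = empirical_innovation_rate(user_b, window)
rate_ab = joint_innovation_rate(user_a, user_b, window)
```

In this toy example the overlap between the two traces is what lowers the joint rate; a practical detector would need a principled threshold and would have to handle clusters with only partially overlapping dictionaries, which is the setting addressed here.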