Finding Domain-Generation Algorithms by Looking at Length Distribution

In order to detect malware that uses domain fluxing to circumvent blacklisting, it is useful to be able to discover new domain-generation algorithms (DGAs) that are being used to generate algorithmically-generated domains (AGDs). This paper presents a procedure for discovering DGAs from Domain Name Service (DNS) query data. It works by identifying client IP addresses with an unusual distribution of second-level string lengths in the domain names that they query. Running this fairly simple procedure on 5 days' data from a large enterprise network uncovered 19 different DGAs, nine of which have not been identified as previously-known. Samples and statistical information about the DGA domains are given.