Analyzing the real-world applicability of DGA classifiers

Separating benign domains from domains generated by DGAs with the help of a binary classifier is a well-studied problem for which promising performance results have been published. The corresponding multiclass task of determining the exact DGA that generated a domain enabling targeted remediation measures is less well studied. Selecting the most promising classifier for these tasks in practice raises a number of questions that have not been addressed in prior work so far. These include the questions on which traffic to train in which network and when, just as well as how to assess robustness against adversarial attacks. Moreover, it is unclear which features lead a classifier to a decision and whether the classifiers are real-time capable. In this paper, we address these issues and thus contribute to bringing DGA detection classifiers closer to practical use. In this context, we propose one novel classifier based on residual neural networks for each of the two tasks and extensively evaluate them as well as previously proposed classifiers in a unified setting. We not only evaluate their classification performance but also compare them with respect to explainability, robustness, and training and classification speed. Finally, we show that our newly proposed binary classifier generalizes well to other networks, is time-robust, and able to identify previously unknown DGAs.

[1]  Leyla Bilge,et al.  Exposure: A Passive DNS Analysis Service to Detect and Report Malicious Domains , 2014, TSEC.

[2]  Ricardo J. Rodríguez,et al.  Detection of Intrusions and Malware, and Vulnerability Assessment , 2016, Lecture Notes in Computer Science.

[3]  Asaf Shabtai,et al.  MaskDGA: A Black-box Evasion Technique Against DGA Classifiers and Adversarial Defenses , 2019, ArXiv.

[4]  Roberto Perdisci,et al.  From Throw-Away Traffic to Bots: Detecting the Rise of DGA-Based Malware , 2012, USENIX Security Symposium.

[5]  Hyrum S. Anderson,et al.  DeepDGA: Adversarially-Tuned Domain Generation and Detection , 2016, AISec@CCS.

[6]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[7]  John Tran,et al.  cuDNN: Efficient Primitives for Deep Learning , 2014, ArXiv.

[8]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Martine De Cock,et al.  Character Level based Detection of DGA Domain Names , 2018, 2018 International Joint Conference on Neural Networks (IJCNN).

[10]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[11]  Wouter Joosen,et al.  Detection of algorithmically generated domain names used by botnets: a dual arms race , 2019, SAC.

[12]  Konstantin Berlin,et al.  eXpose: A Character-Level Convolutional Neural Network with Embeddings For Detecting Malicious URLs, File Paths and Registry Keys , 2017, ArXiv.

[13]  Bin Yu,et al.  CharBot: A Simple and Effective Method for Evading DGA Classifiers , 2019, IEEE Access.

[14]  Paul V. Mockapetris,et al.  Domain names - implementation and specification , 1987, RFC.

[15]  Ulrike Meyer,et al.  FANCI : Feature-based Automated NXDomain Classification and Intelligence , 2018, USENIX Security Symposium.

[16]  Jian Sun,et al.  Identity Mappings in Deep Residual Networks , 2016, ECCV.

[17]  Wouter Joosen,et al.  Tranco: A Research-Oriented Top Sites Ranking Hardened Against Manipulation , 2018, NDSS.

[18]  Martin Rehák,et al.  Detecting DGA malware using NetFlow , 2015, 2015 IFIP/IEEE International Symposium on Integrated Network Management (IM).

[19]  Johannes Bader,et al.  A Comprehensive Measurement Study of Domain Generating Malware , 2016, USENIX Security Symposium.

[20]  Stefano Zanero,et al.  Phoenix: DGA-Based Botnet Tracking and Intelligence , 2014, DIMVA.

[21]  Hai Anh Tran,et al.  A LSTM based framework for handling multiclass imbalance in DGA botnet detection , 2018, Neurocomputing.

[22]  Serge J. Belongie,et al.  Residual Networks Behave Like Ensembles of Relatively Shallow Networks , 2016, NIPS.

[23]  Yong Shi,et al.  Malicious Domain Name Detection Based on Extreme Machine Learning , 2017, Neural Processing Letters.

[24]  Zhi-Hua Zhou,et al.  Ieee Transactions on Knowledge and Data Engineering 1 Training Cost-sensitive Neural Networks with Methods Addressing the Class Imbalance Problem , 2022 .

[25]  Sandeep Yadav,et al.  Winning with DNS Failures: Strategies for Faster Botnet Detection , 2011, SecureComm.

[26]  Hyrum S. Anderson,et al.  Predicting Domain Generation Algorithms with Long Short-Term Memory Networks , 2016, ArXiv.

[27]  Asaf Shabtai,et al.  MaskDGA: An Evasion Attack Against DGA Classifiers and Adversarial Defenses , 2020, IEEE Access.

[28]  Martine De Cock,et al.  An Evaluation of DGA Classifiers , 2018, 2018 IEEE International Conference on Big Data (Big Data).