CharBot: A Simple and Effective Method for Evading DGA Classifiers

Domain generation algorithms (DGAs) are commonly leveraged by malware to create lists of domain names, which can be used for command and control (C&C) purposes. Approaches based on machine learning have recently been developed to automatically detect generated domain names in real-time. In this paper, we present a novel DGA called CharBot, which is capable of producing large numbers of unregistered domain names that are not detected by state-of-the-art classifiers for real-time detection of the DGAs, including the recently published methods FANCI (a random forest based on human-engineered features) and LSTM.MI (a deep learning approach). The CharBot is very simple, effective, and requires no knowledge of the targeted DGA classifiers. We show that retraining the classifiers on CharBot samples is not a viable defense strategy. We believe these findings show that DGA classifiers are inherently vulnerable to adversarial attacks if they rely only on the domain name string to make a decision. Designing a robust DGA classifier may, therefore, necessitate the use of additional information besides the domain name alone. To the best of our knowledge, the CharBot is the simplest and most efficient black-box adversarial attack against DGA classifiers proposed to date.

[1]  Johannes Bader,et al.  A Comprehensive Measurement Study of Domain Generating Malware , 2016, USENIX Security Symposium.

[2]  Stefano Zanero,et al.  Phoenix: DGA-Based Botnet Tracking and Intelligence , 2014, DIMVA.

[3]  Dejing Dou,et al.  HotFlip: White-Box Adversarial Examples for Text Classification , 2017, ACL.

[4]  Martine De Cock,et al.  An Evaluation of DGA Classifiers , 2018, 2018 IEEE International Conference on Big Data (Big Data).

[5]  Seyed-Mohsen Moosavi-Dezfooli,et al.  DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Hyrum S. Anderson,et al.  Predicting Domain Generation Algorithms with Long Short-Term Memory Networks , 2016, ArXiv.

[7]  Ying Tan,et al.  Generating Adversarial Malware Examples for Black-Box Attacks Based on GAN , 2017, DMBD.

[8]  Alejandro Correa Bahnsen,et al.  DeepPhish : Simulating Malicious AI , 2018 .

[9]  Martine De Cock,et al.  Algorithmically Generated Domain Detection and Malware Family Classification , 2018, SSCC.

[10]  Tanmoy Chakraborty,et al.  Phishing URL Detection with Oversampling based on Text Generative Adversarial Networks , 2018, 2018 IEEE International Conference on Big Data (Big Data).

[11]  Hai Anh Tran,et al.  A LSTM based framework for handling multiclass imbalance in DGA botnet detection , 2018, Neurocomputing.

[12]  Bo Li,et al.  Adversarial Texts with Gradient Methods , 2018, ArXiv.

[13]  Vladimir I. Levenshtein,et al.  Binary codes capable of correcting deletions, insertions, and reversals , 1965 .

[14]  Christopher Meek,et al.  Adversarial learning , 2005, KDD '05.

[15]  Ulrike Meyer,et al.  FANCI : Feature-based Automated NXDomain Classification and Intelligence , 2018, USENIX Security Symposium.

[16]  Martine De Cock,et al.  Character Level based Detection of DGA Domain Names , 2018, 2018 International Joint Conference on Neural Networks (IJCNN).

[17]  Sandeep Yadav,et al.  Detecting Algorithmically Generated Domain-Flux Attacks With DNS Traffic Analysis , 2012, IEEE/ACM Transactions on Networking.

[18]  Pierre Lison,et al.  Automatic Detection of Malware-Generated Domains with Recurrent Neural Models , 2017, ArXiv.

[19]  Heejo Lee,et al.  PsyBoG: A scalable botnet detection method for large-scale DNS traffic , 2016, Comput. Networks.

[20]  Fabio A. González,et al.  Classifying phishing URLs using recurrent neural networks , 2017, 2017 APWG Symposium on Electronic Crime Research (eCrime).

[21]  Wouter Joosen,et al.  Detection of algorithmically generated domain names used by botnets: a dual arms race , 2019, SAC.

[22]  Konstantin Berlin,et al.  eXpose: A Character-Level Convolutional Neural Network with Embeddings For Detecting Malicious URLs, File Paths and Registry Keys , 2017, ArXiv.

[23]  Martine De Cock,et al.  Dictionary Extraction and Detection of Algorithmically Generated Domain Names in Passive DNS Traffic , 2018, RAID.

[24]  Pierre Lison,et al.  Neural reputation models learned from passive DNS data , 2017, 2017 IEEE International Conference on Big Data (Big Data).

[25]  Burton H. Bloom,et al.  Space/time trade-offs in hash coding with allowable errors , 1970, CACM.

[26]  Christopher Meek,et al.  Good Word Attacks on Statistical Spam Filters , 2005, CEAS.

[27]  Fernando Pérez-Cruz,et al.  PassGAN: A Deep Learning Approach for Password Guessing , 2017, ACNS.

[28]  Aditi Raghunathan,et al.  Semidefinite relaxations for certifying robustness to adversarial examples , 2018, NeurIPS.

[29]  Aleksander Madry,et al.  Towards Deep Learning Models Resistant to Adversarial Attacks , 2017, ICLR.

[30]  Pedro M. Domingos,et al.  Adversarial classification , 2004, KDD.

[31]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[32]  Hyrum S. Anderson,et al.  DeepDGA: Adversarially-Tuned Domain Generation and Detection , 2016, AISec@CCS.

[33]  Martine De Cock,et al.  Inline DGA Detection with Deep Networks , 2017, 2017 IEEE International Conference on Data Mining Workshops (ICDMW).

[34]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[35]  Chris Kanich,et al.  The Long "Taile" of Typosquatting Domain Names , 2014, USENIX Security Symposium.

[36]  Joewie J. Koh,et al.  Inline Detection of Domain Generation Algorithms with Context-Sensitive Word Embeddings , 2018, 2018 IEEE International Conference on Big Data (Big Data).

[37]  Murat Kantarcioglu,et al.  Adversarial Machine Learning , 2018, Adversarial Machine Learning.

[38]  Asaf Shabtai,et al.  MaskDGA: A Black-box Evasion Technique Against DGA Classifiers and Adversarial Defenses , 2019, ArXiv.

[39]  Roberto Perdisci,et al.  From Throw-Away Traffic to Bots: Detecting the Rise of DGA-Based Malware , 2012, USENIX Security Symposium.