Detecting Multielement Algorithmically Generated Domain Names Based on Adaptive Embedding Model

As detection algorithms for malicious dynamic domain names have improved, domain generation algorithms (DGAs) have become stealthier. Generating domains from multiple elements makes them harder to detect. To improve the detection accuracy for algorithmically generated domain names built from multiple elements, a domain name syntax model is proposed that analyzes the elements in a domain name and their syntactic relationships, together with an adaptive embedding method that parses domain names into those elements. A parallel convolutional model based on a feature selection module, combined with an improved dynamic loss function based on curriculum learning, is then proposed to detect multielement malicious domain names. A series of experiments compares the proposed model with five previous algorithms. The experimental results show that the proposed model detects multielement malicious domain names with significantly higher accuracy than the comparison algorithms and also adapts well to other types of malicious domain names.
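To make the notion of a multielement domain name concrete, the following minimal sketch splits one domain label into elements. It is an illustration only and not the adaptive embedding method of the paper: the element kinds (`word`, `digits`, `sep`, `chars`), the toy vocabulary `TOY_WORDS`, and the greedy longest-match rule are all assumptions chosen for the example.

```python
import re

# Hypothetical toy vocabulary (assumption); a real system would need a far larger word list.
TOY_WORDS = {"secure", "login", "bank", "mail", "shop", "pay"}


def _split_letters(run, words, min_word_len):
    """Greedy longest-match segmentation of a run of letters."""
    out, i, pending = [], 0, ""
    while i < len(run):
        match = None
        # Try the longest candidate first; candidates shorter than min_word_len are skipped.
        for j in range(len(run), i + min_word_len - 1, -1):
            if run[i:j] in words:
                match = run[i:j]
                break
        if match:
            if pending:
                out.append(("chars", pending))
                pending = ""
            out.append(("word", match))
            i += len(match)
        else:
            # No vocabulary word starts here; collect the letter as unexplained characters.
            pending += run[i]
            i += 1
    if pending:
        out.append(("chars", pending))
    return out


def parse_elements(label, words=TOY_WORDS, min_word_len=3):
    """Split one domain label into (kind, text) elements.

    kind is "word", "digits", "sep" (hyphens), or "chars" (letters not in the vocabulary).
    Characters other than a-z, 0-9, and '-' (for example dots) are ignored.
    """
    elements = []
    for run in re.findall(r"[a-z]+|[0-9]+|-+", label.lower()):
        if run[0].isdigit():
            elements.append(("digits", run))
        elif run[0] == "-":
            elements.append(("sep", run))
        else:
            elements.extend(_split_letters(run, words, min_word_len))
    return elements


if __name__ == "__main__":
    print(parse_elements("securelogin2020xkq"))
```

A label such as `securelogin2020xkq` mixes dictionary words, a digit run, and random-looking characters, so a detector that assumes a single character distribution or a single word list sees only part of its structure. A fixed vocabulary with greedy matching is the simplest possible parser and will missegment many real names, which is the kind of limitation a learned, adaptive parsing step is meant to address.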
