A Visualized Botnet Detection System Based Deep Learning for the Internet of Things Networks of Smart Cities

Internet of Things (IoT) applications for smart cities have become a primary target for botnet-driven advanced persistent threats. This article proposes a botnet detection system based on a two-level deep learning framework that semantically discriminates between botnet and legitimate behavior at the application layer, using domain name system (DNS) services. In the first level of the framework, similarity measures between DNS queries are estimated with Siamese networks, and a predefined threshold is used to select the most frequent DNS information across Ethernet connections. In the second level, a domain generation algorithm (DGA) model based on deep learning architectures is proposed for categorizing domain names as normal or abnormal. Because the framework is designed around the analysis of DNS data, it scales well on a commodity hardware server. The proposed framework was evaluated on two datasets and compared with recent deep learning models. Various visualization methods were also used to understand the characteristics of the datasets and to visualize the embedding features. The experimental results showed substantial improvements in F1-score, detection speed, and false alarm rate.
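The first-level idea of comparing two DNS queries through one shared encoder and a threshold can be illustrated with a minimal Python sketch. This is not the authors' model: a fixed character-bigram encoder stands in for the learned Siamese network, and the names `encode`, `is_similar`, `f1_and_false_alarm_rate`, and the threshold of 0.8 are hypothetical choices made only for illustration. The last function shows how the two reported metrics, F1-score and false alarm rate, are conventionally computed from confusion-matrix counts.

```python
import math
from collections import Counter


def encode(domain: str) -> Counter:
    """Shared encoder (assumption: bigram counts replace a learned embedding)."""
    name = domain.lower().strip(".")
    padded = f"^{name}$"  # mark the start and end of the name
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_similar(query: str, reference: str, threshold: float = 0.8) -> bool:
    """Both inputs pass through the same encoder, as in a Siamese setup.

    The threshold value is a hypothetical placeholder, not taken from the paper.
    """
    return cosine_similarity(encode(query), encode(reference)) >= threshold


def f1_and_false_alarm_rate(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (F1-score, false alarm rate) from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    false_alarm_rate = fp / (fp + tn) if fp + tn else 0.0
    return f1, false_alarm_rate


if __name__ == "__main__":
    print(is_similar("example.com", "example.com"))
    print(is_similar("example.com", "xkqzjvwp.net"))
    print(f1_and_false_alarm_rate(tp=90, fp=10, tn=90, fn=10))
```

In a trained Siamese network the encoder weights would be learned from labeled pairs of similar and dissimilar queries. The sketch keeps only the structural point that both queries share one encoder and that the decision comes from thresholding their similarity.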
