Improving ML Detection of IoT Botnets using Comprehensive Data and Feature Sets

In recent years, the number of attacks on IoT devices has increased sharply. The majority of these have been botnet attacks, in which an army of compromised IoT devices is used to launch DDoS attacks on targeted systems. In this paper, we study how the choice of dataset and extracted features determines the performance of a machine learning (ML) model tasked with classifying Linux binaries (ELFs) as benign or malicious. Our work focuses on Linux systems because embedded Linux is the more popular choice for building today’s IoT devices and systems. We propose using four types of files as the dataset for any ML model: system files, IoT application files, IoT botnet files, and general malware files. Further, we propose using static, dynamic, and network features for the classification task. We show that existing methods leave out at least one of these feature types or file types, and that our model consequently outperforms them in detection accuracy. While enhancing the dataset adds to the robustness of a model, utilizing all three types of features decreases the false positive and false negative rates non-trivially. We employ an exhaustive scenario-based method for evaluating an ML model and show the importance of including each of the proposed file types in a dataset. We also analyze the features and try to explain their importance to a model using trends observed across benign and malicious files. We perform feature extraction using the open-source Limon sandbox, which prior to this work had been tested only on Ubuntu 14. We installed and configured it for Ubuntu 18 and have shared the documentation for this setup on GitHub.
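To make the reported metrics concrete, the following is a minimal sketch of how false positive and false negative rates can be computed over a dataset built from the four proposed file types. The category names, the mapping of categories to benign or malicious labels, and the example predictions are illustrative assumptions, not the paper's data or code.

```python
from collections import Counter

# Assumed mapping of the four proposed file types to ground-truth labels.
CATEGORY_LABEL = {
    "system": "benign",
    "iot_application": "benign",
    "iot_botnet": "malicious",
    "general_malware": "malicious",
}


def error_rates(samples):
    """Return (false_positive_rate, false_negative_rate).

    samples: iterable of (category, predicted_label) pairs, where
    predicted_label is "benign" or "malicious".
    """
    counts = Counter()
    for category, predicted in samples:
        actual = CATEGORY_LABEL[category]
        counts[(actual, predicted)] += 1

    fp = counts[("benign", "malicious")]     # benign file flagged as malicious
    tn = counts[("benign", "benign")]
    fn = counts[("malicious", "benign")]     # malicious file missed
    tp = counts[("malicious", "malicious")]

    # Guard against empty classes to avoid division by zero.
    fpr = fp / (fp + tn) if fp + tn else 0.0
    fnr = fn / (fn + tp) if fn + tp else 0.0
    return fpr, fnr


# Made-up predictions, for illustration only.
example = [
    ("system", "benign"),
    ("system", "malicious"),
    ("iot_application", "benign"),
    ("iot_application", "benign"),
    ("iot_botnet", "malicious"),
    ("iot_botnet", "benign"),
    ("general_malware", "malicious"),
    ("general_malware", "malicious"),
]

fpr, fnr = error_rates(example)
print(f"FPR={fpr:.2f} FNR={fnr:.2f}")
```

Running the same computation on predictions from models trained with different subsets of file types or feature types would be one way to compare scenarios of the kind the abstract describes.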
