A Multi-Dimensional Deep Learning Framework for IoT Malware Classification and Family Attribution

The emergence of Internet of Things (IoT) malware, which leverages exploited IoT devices to perform large-scale cyber attacks (e.g., the Mirai botnet), is considered a major threat to the Internet ecosystem. To mitigate this threat, there is a pressing need for effective IoT malware classification and family attribution, which are essential steps towards initiating attack mitigation and prevention countermeasures. In this paper, motivated by the lack of sophisticated obfuscation in IoT malware implementations, we utilize features extracted from strings- and image-based representations of the executable binaries to propose a novel multi-dimensional classification approach using Deep Learning (DL) architectures. To this end, we analyze more than 70,000 recently detected IoT malware samples. Our in-depth experiments with four prominent IoT malware families show that the approach achieves high accuracy (99.78%) and outperforms conventional single-level classifiers. Additionally, we utilize our IoT-tailored approach to label newly detected “unknown” malware samples, most of which were attributed to a few predominant families. Finally, this work contributes to the security of future networks (e.g., 5G) through the implementation of effective tools and techniques for timely IoT malware classification and attack mitigation.
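As an illustration of the two representations named in the abstract, the following is a minimal sketch of how a strings-based and an image-based view could be derived from the raw bytes of an executable. It is not the authors' pipeline: the function names `extract_strings` and `bytes_to_image`, the minimum string length of 4, and the image width of 64 are all assumptions chosen for the example.

```python
import math
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII of at least min_len characters.

    Similar in spirit to the Unix `strings` tool; min_len=4 is an assumed default.
    """
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def bytes_to_image(data: bytes, width: int = 64) -> list[list[int]]:
    """Map each byte to one grayscale pixel (0-255), filling rows of `width` pixels.

    The last row is zero-padded. The width of 64 is an assumption for this sketch.
    """
    if not data:
        return []
    height = math.ceil(len(data) / width)
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


if __name__ == "__main__":
    # Hypothetical byte sequence standing in for a binary's contents.
    sample = b"\x00\x01/bin/busybox\x00ab\x00MIRAI\xff"
    print(extract_strings(sample))
    print(bytes_to_image(b"\x01\x02\x03", width=2))
```

Each view could then feed its own model, with the per-view outputs combined in a later stage; how the paper actually combines them is not specified in the abstract.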
