MQTTset, a New Dataset for Machine Learning Techniques on MQTT

IoT networks are increasingly used to monitor critical environments of various kinds, significantly increasing the amount of data exchanged. Given the huge number of connected IoT devices, the security of such networks and devices is a critical issue. Detection systems play a crucial role in cyber-security: based on techniques such as machine learning, they can identify or predict cyber-attacks and thus protect the underlying system. However, training detection models requires specific datasets. In this work we present MQTTset, a dataset focused on the MQTT protocol, which is widely adopted in IoT networks. We describe how the dataset was created and validate it by defining a hypothetical detection system that combines the legitimate dataset with cyber-attacks against the MQTT network. The results show that MQTTset can be used to train machine learning models for detection systems that protect IoT contexts.
