TON_IoT Telemetry Dataset: A New Generation Dataset of IoT and IIoT for Data-Driven Intrusion Detection Systems

Although the Internet of Things (IoT) can increase efficiency and productivity through intelligent and remote management, it also increases the risk of cyber-attacks. The potential threats to IoT applications and the need to reduce risk have recently become an interesting research topic. It is crucial that effective Intrusion Detection Systems (IDSs) tailored to IoT applications be developed. Such IDSs require an updated and representative IoT dataset for training and evaluation. However, there is a lack of benchmark IoT and IIoT datasets for assessing IDSs-enabled IoT systems. This paper addresses this issue and proposes a new data-driven IoT/IIoT dataset with the ground truth that incorporates a label feature indicating normal and attack classes, as well as a type feature indicating the sub-classes of attacks targeting IoT/IIoT applications for multi-classification problems. The proposed dataset, which is named TON_IoT, includes Telemetry data of IoT/IIoT services, as well as Operating Systems logs and Network traffic of IoT network, collected from a realistic representation of a medium-scale network at the Cyber Range and IoT Labs at the UNSW Canberra (Australia). This paper also describes the proposed dataset of the Telemetry data of IoT/IIoT services and their characteristics. TON_IoT has various advantages that are currently lacking in the state-of-the-art datasets: i) it has various normal and attack events for different IoT/IIoT services, and ii) it includes heterogeneous data sources. We evaluated the performance of several popular Machine Learning (ML) methods and a Deep Learning model in both binary and multi-class classification problems for intrusion detection purposes using the proposed Telemetry dataset.

[1]  Lav Gupta,et al.  Machine Learning-Based Network Vulnerability Analysis of Industrial Internet of Things , 2019, IEEE Internet of Things Journal.

[2]  Lovekesh Vig,et al.  Long Short Term Memory Networks for Anomaly Detection in Time Series , 2015, ESANN.

[3]  Mazliza Othman,et al.  Internet of Things security: A survey , 2017, J. Netw. Comput. Appl..

[4]  Mohsen Guizani,et al.  Internet of Things: A Survey on Enabling Technologies, Protocols, and Applications , 2015, IEEE Communications Surveys & Tutorials.

[5]  Jiankun Hu,et al.  A holistic review of Network Anomaly Detection Systems: A comprehensive survey , 2019, J. Netw. Comput. Appl..

[6]  Ali Dehghantanha,et al.  A Two-Layer Dimension Reduction and Two-Tier Classification Model for Anomaly-Based Intrusion Detection in IoT Backbone Networks , 2019, IEEE Transactions on Emerging Topics in Computing.

[7]  Marimuthu Palaniswami,et al.  Labelled data collection for anomaly detection in wireless sensor networks , 2010, 2010 Sixth International Conference on Intelligent Sensors, Sensor Networks and Information Processing.

[8]  Howard Shrobe,et al.  IIoT Cybersecurity Risk Modeling for SCADA Systems , 2018, IEEE Internet of Things Journal.

[9]  Georgios Kambourakis,et al.  Intrusion Detection in 802.11 Networks: Empirical Evaluation of Threats and a Public Dataset , 2016, IEEE Communications Surveys & Tutorials.

[10]  Naveen K. Chilamkurti,et al.  Deep Learning: The Frontier for Distributed Attack Detection in Fog-to-Things Computing , 2018, IEEE Communications Magazine.

[11]  Sean Carlisto de Alvarenga,et al.  A survey of intrusion detection in Internet of Things , 2017, J. Netw. Comput. Appl..

[12]  Michael Backes,et al.  Identifying the Scan and Attack Infrastructures Behind Amplification DDoS Attacks , 2016, CCS.

[13]  Elena Sitnikova,et al.  Towards the Development of Realistic Botnet Dataset in the Internet of Things for Network Forensic Analytics: Bot-IoT Dataset , 2018, Future Gener. Comput. Syst..

[14]  Vijay Sivaraman,et al.  Classifying IoT Devices in Smart Environments Using Network Traffic Characteristics , 2019, IEEE Transactions on Mobile Computing.

[15]  Philip S. Yu,et al.  Top 10 algorithms in data mining , 2007, Knowledge and Information Systems.

[16]  Theophilus A. Benson,et al.  Detecting Volumetric Attacks on loT Devices via SDN-Based Monitoring of MUD Activity , 2019, SOSR.

[17]  Thiemo Voigt,et al.  SVELTE: Real-time intrusion detection in the Internet of Things , 2013, Ad Hoc Networks.

[18]  André C. Drummond,et al.  A Survey of Random Forest Based Methods for Intrusion Detection Systems , 2018, ACM Comput. Surv..

[19]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[20]  Yang Xiang,et al.  A survey on security control and attack detection for industrial cyber-physical systems , 2018, Neurocomputing.

[21]  Naveen K. Chilamkurti,et al.  Distributed attack detection scheme using deep learning approach for Internet of Things , 2017, Future Gener. Comput. Syst..

[22]  Yi Zhou,et al.  Understanding the Mirai Botnet , 2017, USENIX Security Symposium.

[23]  So Young Sohn,et al.  Random effects logistic regression model for anomaly detection , 2007, Expert Syst. Appl..

[24]  Saman Taghavi Zargar,et al.  A Survey of Defense Mechanisms Against Distributed Denial of Service (DDoS) Flooding Attacks , 2013, IEEE Communications Surveys & Tutorials.

[25]  Isabelle Guyon,et al.  A Scaling Law for the Validation-Set Training-Set Size Ratio , 1997 .

[26]  Neelam Sharma,et al.  INTRUSION DETECTION USING NAIVE BAYES CLASSIFIER WITH FEATURE REDUCTION , 2012 .

[27]  Marimuthu Palaniswami,et al.  Internet of Things (IoT): A vision, architectural elements, and future directions , 2012, Future Gener. Comput. Syst..

[28]  Parvez Faruki,et al.  Network Intrusion Detection for IoT Security Based on Learning Techniques , 2019, IEEE Communications Surveys & Tutorials.

[29]  Muna Al-Hawawreh,et al.  Targeted Ransomware: A New Cyber Threat to Edge System of Brownfield Industrial Internet of Things , 2019, IEEE Internet of Things Journal.

[30]  Gordon Fyodor Lyon,et al.  Nmap Network Scanning: The Official Nmap Project Guide to Network Discovery and Security Scanning , 2009 .

[31]  Ali A. Ghorbani,et al.  Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization , 2018, ICISSP.

[32]  Lalu Banoth,et al.  A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection , 2017 .

[33]  Kwangjo Kim,et al.  Deep Abstraction and Weighted Feature Selection for Wi-Fi Impersonation Detection , 2018, IEEE Transactions on Information Forensics and Security.

[34]  Sushanta Karmakar,et al.  Intrusion Detection Systems using Linear Discriminant Analysis and Logistic Regression , 2015, 2015 Annual IEEE India Conference (INDICON).

[35]  Bander Ali Saleh Al-rimy,et al.  Ransomware threat success factors, taxonomy, and countermeasures: A survey and research directions , 2018, Comput. Secur..

[36]  Jiankun Hu,et al.  A New Threat Intelligence Scheme for Safeguarding Industry 4.0 Systems , 2018, IEEE Access.

[37]  Albert Y. Zomaya,et al.  A Dimension Reduction Model and Classifier for Anomaly-Based Intrusion Detection in Internet of Things , 2017, 2017 IEEE 15th Intl Conf on Dependable, Autonomic and Secure Computing, 15th Intl Conf on Pervasive Intelligence and Computing, 3rd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress(DASC/PiCom/DataCom/CyberSciTech).

[38]  Song Han,et al.  Industrial Internet of Things: Challenges, Opportunities, and Directions , 2018, IEEE Transactions on Industrial Informatics.

[39]  Nour Moustafa,et al.  UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set) , 2015, 2015 Military Communications and Information Systems Conference (MilCIS).

[40]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[41]  Shadi A. Aljawarneh,et al.  GARUDA: Gaussian dissimilarity measure for feature representation and anomaly detection in Internet of things , 2018, The Journal of Supercomputing.

[42]  Zhihua Qu,et al.  Resilient Reinforcement in Secure State Estimation Against Sensor Attacks With A Priori Information , 2019, IEEE Transactions on Automatic Control.

[43]  Ali A. Ghorbani,et al.  A detailed analysis of the KDD CUP 99 data set , 2009, 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications.

[44]  Ivan Stojmenovic,et al.  The Fog computing paradigm: Scenarios and security issues , 2014, 2014 Federated Conference on Computer Science and Information Systems.

[45]  Mohsen Guizani,et al.  Deep Learning for IoT Big Data and Streaming Analytics: A Survey , 2017, IEEE Communications Surveys & Tutorials.

[46]  Andrew W. Senior,et al.  Long short-term memory recurrent neural network architectures for large scale acoustic modeling , 2014, INTERSPEECH.

[47]  Wu He,et al.  Internet of Things in Industries: A Survey , 2014, IEEE Transactions on Industrial Informatics.

[48]  Aurélien Géron,et al.  Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems , 2017 .