Annotated Dataset for Anomaly Detection in a Data Center with IoT Sensors

The relative simplicity of IoT networks increases their exposure to service vulnerabilities and to network failures that reveal system weaknesses. A labeled dataset with a sufficient number of samples and a systematic analysis is therefore essential for understanding how these networks behave and for detecting traffic anomalies. This work presents DAD: a complete, labeled IoT dataset that reproduces certain real-world behaviors as seen from the network. To bring the dataset closer to a real environment, the data were obtained from a physical data center, using temperature sensors based on NFC smart passive sensor technology. After several approaches were evaluated, mathematical modeling based on time series was chosen. The virtual infrastructure used to create the dataset consists of five virtual machines: an MQTT broker and four client nodes, each with four sensors of the refrigeration units connected to the internal IoT network. DAD covers seven days of network activity with three types of anomalies on the MQTT messages (duplication, interception and modification), spread over five days. Finally, the features are described so that the dataset can be used with prediction or automatic classification techniques.
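As an illustration of the three anomaly types named above, the following minimal Python sketch applies duplication, interception and modification to a toy stream of temperature readings. It is not the procedure used to build DAD: the `Reading` record, the sequence-number field, the sensor name, the label strings, and the reading of interception as a message that never reaches the subscriber are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Reading:
    """Hypothetical stand-in for one MQTT temperature message."""
    seq: int            # assumed per-sensor sequence number
    sensor: str         # assumed sensor identifier
    temp_c: float       # temperature payload in degrees Celsius
    label: str = "normal"


def duplicate(msgs: List[Reading], idx: int) -> List[Reading]:
    """Duplication: the message at idx is delivered a second time."""
    out = list(msgs)
    out.insert(idx + 1, replace(msgs[idx], label="duplication"))
    return out


def intercept(msgs: List[Reading], idx: int) -> List[Reading]:
    """Interception (assumed meaning): the message at idx never arrives."""
    return msgs[:idx] + msgs[idx + 1:]


def modify(msgs: List[Reading], idx: int, offset: float) -> List[Reading]:
    """Modification: the payload of the message at idx is altered."""
    out = list(msgs)
    out[idx] = replace(
        msgs[idx], temp_c=msgs[idx].temp_c + offset, label="modification"
    )
    return out


def missing_seqs(msgs: List[Reading]) -> List[int]:
    """Sequence numbers absent between the first and last message seen."""
    seen = {m.seq for m in msgs}
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]


# Toy baseline: six readings from one hypothetical sensor.
baseline = [
    Reading(seq=i, sensor="unit1-s1", temp_c=20.0 + 0.1 * i) for i in range(6)
]

duplicated = duplicate(baseline, 1)
intercepted = intercept(baseline, 3)
modified = modify(baseline, 4, offset=5.0)
```

In this sketch an intercepted message leaves no record to label, so it can only be inferred from a gap in the sequence numbers (`missing_seqs(intercepted)`), whereas duplicated and modified messages stay in the stream and carry their own label.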
