Using Autoencoders for Anomaly Detection and Transfer Learning in IoT

With the development of Internet of Things (IoT) technologies, more and more smart devices are connected to the Internet. Because these devices were designed primarily for connectivity, they typically include only very limited security mechanisms, and developing a separate security mechanism for the diverse behavior of each device would be costly. Given new and changing devices and attacks, it would be helpful if the characteristics of diverse device types could be learned dynamically for better protection. In this paper, we propose a machine learning approach to device type identification through network traffic analysis for anomaly detection in IoT. First, the characteristics of different device types are learned from the network packets they generate using supervised learning methods. Second, by learning important features from selected device types, we compare the effects of unsupervised learning methods, including One-Class SVM, Isolation Forest, and autoencoders for dimensionality reduction. Finally, we evaluate the performance of anomaly detection by transfer learning with autoencoders. In our experiments on real data from the target factory, the best device type identification performance is achieved by XGBoost, with an accuracy of 97.6%. When autoencoders are used to learn features from Modbus TCP network packets, the best F1 score is 98.36%. Comparable anomaly detection performance is achieved when autoencoders are used for transfer learning from a reference dataset in the literature to our target site. This shows the potential of the proposed approach for automatic anomaly detection in smart factories. Further investigation is needed to verify the proposed approach using different types of devices in different IoT environments.
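To make the autoencoder step concrete, the following is a minimal sketch of reconstruction-error anomaly detection, not the implementation used in the paper. Everything in it is an assumption made for illustration: the data are synthetic stand-ins for standardized packet features, the autoencoder is a small linear one trained with plain gradient descent in NumPy, the threshold is the 99th percentile of the training errors, and the function names are hypothetical.

```python
import numpy as np


def train_linear_autoencoder(X, n_hidden=2, lr=0.02, epochs=2000, seed=0):
    """Fit encoder/decoder weights by full-batch gradient descent on the
    mean squared reconstruction error (illustrative, assumed architecture)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(0.0, 0.1, size=(d, n_hidden))
    W_dec = rng.normal(0.0, 0.1, size=(n_hidden, d))
    for _ in range(epochs):
        H = X @ W_enc                      # encode to the bottleneck
        err = H @ W_dec - X                # reconstruction residual
        grad_out = 2.0 * err / n           # d(loss)/d(reconstruction)
        grad_dec = H.T @ grad_out
        grad_enc = X.T @ (grad_out @ W_dec.T)
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return W_enc, W_dec


def reconstruction_error(X, W_enc, W_dec):
    """Per-sample mean squared reconstruction error (the anomaly score)."""
    return np.mean((X @ W_enc @ W_dec - X) ** 2, axis=1)


# Synthetic "normal" traffic: 6 features driven by 2 hidden factors.
rng = np.random.default_rng(42)
mixing = rng.normal(size=(2, 6))
latent = rng.normal(size=(600, 2))
normal = latent @ mixing + 0.05 * rng.normal(size=(600, 6))
train_raw, test_raw = normal[:500], normal[500:]

# Standardize with statistics from the training (normal-only) data.
mu, sigma = train_raw.mean(axis=0), train_raw.std(axis=0)
train = (train_raw - mu) / sigma
test_normal = (test_raw - mu) / sigma

# Synthetic anomalies: points that ignore the normal feature correlations.
anomalies = rng.normal(0.0, 3.0, size=(100, 6))

# Train on normal data only, then threshold the reconstruction error.
W_enc, W_dec = train_linear_autoencoder(train)
threshold = np.percentile(reconstruction_error(train, W_enc, W_dec), 99)

false_alarm_rate = np.mean(reconstruction_error(test_normal, W_enc, W_dec) > threshold)
detection_rate = np.mean(reconstruction_error(anomalies, W_enc, W_dec) > threshold)
print(f"false alarms: {false_alarm_rate:.2%}, detected: {detection_rate:.2%}")
```

The design choice here is that the model only ever sees normal traffic, so samples that do not follow the learned structure reconstruct poorly and exceed the threshold. A transfer learning variant would, under the same assumptions, initialize `W_enc` and `W_dec` from weights trained on a reference dataset and continue training on data from the target site.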
