Performance of Machine Learning-Based Multi-Model Voting Ensemble Methods for Network Threat Detection in Agriculture 4.0

The upcoming agricultural revolution, known as Agriculture 4.0, integrates cutting-edge Information and Communication Technologies into existing operations. The cyber threats that accompany this integration have attracted increasing interest from security researchers. Network traffic analysis and classification based on Machine Learning (ML) can play a vital role in tackling such threats. To this end, this work presents and evaluates five ML classifiers for network traffic classification, namely K-Nearest Neighbors (KNN), Support Vector Classification (SVC), Decision Tree (DT), Random Forest (RF) and Stochastic Gradient Descent (SGD), as well as a hard voting and a soft voting ensemble of these classifiers. Three variations of the NSL-KDD dataset were used: the initial dataset, an undersampled dataset and an oversampled dataset. The individual ML algorithms were evaluated on all three variations, and their performance was compared with that of the voting ensembles. In most cases, both the hard and the soft voting models achieved higher accuracy than the individual models.
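The difference between the two ensemble rules can be illustrated with a minimal sketch. It follows the usual textbook definitions (hard voting takes the majority of predicted labels, soft voting takes the class with the highest averaged probability) and is not the authors' implementation. The function names, the lowest-index tie-break, and the probability values are assumptions made up for illustration only; they are not results from the paper.

```python
from collections import Counter
from typing import Optional, Sequence


def hard_vote(probas: Sequence[Sequence[float]]) -> int:
    """Majority vote over each classifier's predicted label.

    Each classifier's label is the argmax of its class probabilities.
    Ties are broken in favour of the lowest class index (an assumption).
    """
    labels = [max(range(len(p)), key=p.__getitem__) for p in probas]
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, c in counts.items() if c == top)


def soft_vote(probas: Sequence[Sequence[float]],
              weights: Optional[Sequence[float]] = None) -> int:
    """Return the class with the highest (optionally weighted) mean probability."""
    w = [1.0] * len(probas) if weights is None else list(weights)
    total = sum(w)
    n_classes = len(probas[0])
    mean = [
        sum(wi * p[k] for wi, p in zip(w, probas)) / total
        for k in range(n_classes)
    ]
    return max(range(n_classes), key=mean.__getitem__)


# Hypothetical outputs of five classifiers for one traffic record.
# Class 0 = normal, class 1 = attack (labels assumed for this example).
example = [
    [0.55, 0.45],  # weakly "normal"
    [0.60, 0.40],  # weakly "normal"
    [0.52, 0.48],  # weakly "normal"
    [0.05, 0.95],  # strongly "attack"
    [0.10, 0.90],  # strongly "attack"
]

print("hard vote:", hard_vote(example))
print("soft vote:", soft_vote(example))
```

In this made-up case three classifiers lean weakly towards class 0 while two are highly confident in class 1, so the two rules disagree: the label majority favours class 0, whereas the averaged probabilities favour class 1. Soft voting requires every base classifier to output class probabilities, which is a practical constraint when combining models such as SVC or SGD.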
