A Survey of Random Forest Based Methods for Intrusion Detection Systems

Over the past decades, researchers have proposed a range of Intrusion Detection approaches to cope with the growing number and complexity of threats to computer systems. In this context, Random Forest models have performed notably well in behaviour-based Intrusion Detection Systems. Specific properties of the Random Forest model are used for classification, feature selection, and proximity metrics. This work provides a comprehensive review of the basic concepts related to Intrusion Detection Systems, including taxonomies, attacks, data collection, modelling, evaluation metrics, and commonly used methods. It also surveys Random Forest based methods applied in this context, considering the particularities of these models. Finally, open questions and challenges are posed, together with possible directions for addressing them, which may guide future work in the area.
