A Survey and Taxonomy on Data and Pre-processing Techniques of Intrusion Detection Systems

This chapter presents a new review and taxonomy of the input data and pre-processing techniques used in intrusion detection systems (IDS), surveying the literature of the last two decades. We also present a framework for understanding the different components described in the literature, which allows readers to understand the works systematically and to envision future hybrid approaches. The chapter describes how to collect the data and how to prepare it for different types of processing. We organize the chapter component by component rather than paper by paper, since we believe this gives the reader a wider perspective on the process of constructing an intrusion detection system and on its evaluation mechanisms. The organization of the chapter mirrors an ideal intrusion detection system, in that it contains most IDS components, so existing approaches can be neatly accommodated within this framework. This allows the reader to construct and explore new systems by assembling the described components in novel arrangements. After each IDS component, we also provide comparisons, supported by tables, to give the reader a better perspective on that particular component. In this sense, the chapter provides insights that a reader would not gain by simply reading the original source papers. The classifiers used with IDS are beyond the scope of this chapter.
