Intrusion Detection using Data Mining: A contemporary comparative study

Intrusion detection systems play a crucial role in an era where networks reach almost every sector. Unfortunately, these systems are far from perfect, and researchers continue to work on improving them. In this context, data mining techniques have been widely applied to intrusion detection. In this paper, we present a comparative study of data mining techniques for intrusion detection. Specifically, we study the overall performance of these methods as well as the impact of training data size on their results. We use ISCX2012 as the benchmark for our experiments, a realistic dataset that represents, to a certain extent, today’s network traffic. The study shows that relatively old methods outperform some of the techniques currently in wide use by the community. Regarding the impact of training dataset size, the investigated methods react differently from one another when more data is added to the training dataset. In addition, the results highlight the importance of attack traffic in the training dataset. Moreover, they strongly suggest the use of Random Forest for intrusion detection, owing to the linear relationship between its performance and the size of the training dataset.
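The training-size experiment described above can be illustrated with a minimal sketch. It is not the study's code. It assumes synthetic two-feature "flows" in place of ISCX2012, and a nearest-centroid classifier as a simple stand-in for the compared methods. All function names and parameters (`make_flows`, `learning_curve`, `attack_ratio`) are hypothetical. The idea is only to show the protocol: train on growing prefixes of one pool and score each model on the same held-out set.

```python
import random


def make_flows(n, attack_ratio, rng):
    """Generate n synthetic (features, label) pairs; label 1 = attack.

    Assumption: two Gaussian features per flow, NOT real ISCX2012 data.
    """
    flows = []
    for _ in range(n):
        label = 1 if rng.random() < attack_ratio else 0
        center = 2.0 if label else 0.0
        features = [rng.gauss(center, 1.0), rng.gauss(center, 1.0)]
        flows.append((features, label))
    return flows


def fit_centroids(train):
    """Return the mean feature vector of each class present in train."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, test):
    """Fraction of test flows whose predicted label matches the true label."""
    hits = sum(predict(centroids, f) == y for f, y in test)
    return hits / len(test)


def learning_curve(sizes, attack_ratio=0.3, seed=0):
    """Train on growing prefixes of one pool; score each on a fixed test set."""
    rng = random.Random(seed)
    pool = make_flows(max(sizes), attack_ratio, rng)
    test = make_flows(500, attack_ratio, rng)
    return [(n, accuracy(fit_centroids(pool[:n]), test)) for n in sizes]


if __name__ == "__main__":
    for n, acc in learning_curve([20, 100, 500, 2000]):
        print(f"train size {n:5d}: accuracy {acc:.3f}")
```

Keeping the test set fixed while only the training prefix grows means that any change in accuracy can be attributed to training size rather than to a different evaluation sample.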
