Performance analysis of network intrusion detection schemes using Apache Spark

Fast and efficient network intrusion detection is challenging because network traffic has grown increasingly large and complex. A real-time intrusion detection system must process large volumes of traffic data as quickly as possible so that intrusions into the communication system are stopped early. In this paper, we employ five machine learning algorithms to detect attack traffic: logistic regression, support vector machines, random forest, gradient boosted decision trees, and naive Bayes. To process the traffic and detect attacks quickly, we use Apache Spark, a big data processing tool, for the detection and analysis of intrusions in communication network traffic. We compare the intrusion detection schemes in terms of training time, prediction time, accuracy, sensitivity, and specificity on the KDD'99 data set.
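As an illustration of the evaluation criteria named above, the following minimal sketch computes accuracy, sensitivity, and specificity from predicted and true labels, and measures elapsed time for a training or prediction call. It is plain Python rather than Spark code and is not taken from the paper: the function names (`confusion_counts`, `detection_metrics`, `timed`) are hypothetical, and the label convention (1 = attack, 0 = normal) is an assumption.

```python
import time


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary detector.

    Assumption: `positive` marks attack traffic; any other label is normal.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # attack correctly flagged
            else:
                fp += 1  # normal traffic flagged as attack
        else:
            if truth == positive:
                fn += 1  # attack missed
            else:
                tn += 1  # normal traffic correctly passed
    return tp, fp, tn, fn


def detection_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity (TPR), and specificity (TNR)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds).

    Can wrap a hypothetical model.fit or model.predict call to obtain
    training time or prediction time.
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Toy labels for illustration only (not KDD'99 data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
print(detection_metrics(y_true, y_pred))
# {'accuracy': 0.75, 'sensitivity': 0.75, 'specificity': 0.75}
```

Sensitivity here is the fraction of attack records that are detected, and specificity is the fraction of normal records that are not flagged. Reporting both alongside accuracy matters when attack and normal traffic are unevenly represented in the data.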
