Efficient classification using parallel and scalable compressed model and its application on intrusion detection

Abstract To achieve highly efficient classification in intrusion detection, this paper proposes a compressed model that combines horizontal compression with vertical compression. OneR is used for horizontal compression (attribute reduction), and affinity propagation is used for vertical compression, selecting a small set of representative exemplars from large training data. To compress larger volumes of training data scalably, a MapReduce-based parallelization approach is implemented and evaluated for each step of the model compression process, after which common but efficient classification methods can be applied directly. An experimental study on two publicly available intrusion detection datasets, KDD99 and CMDC2012, demonstrates that classification with the proposed compressed model speeds up detection by up to 184 times at the cost of a minimal accuracy difference of less than 1% on average.
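The two compression steps named in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' implementation and leaves out the MapReduce parallelization entirely. The function names (`oner_error`, `rank_attributes`, `affinity_propagation`), the toy data, the choice of negative squared Euclidean distance as similarity, and the median preference are all assumptions made for illustration. Attributes passed to the OneR step are assumed to be categorical or already discretized.

```python
from collections import Counter, defaultdict

def oner_error(column, labels):
    """Training error of a one-attribute rule: each value predicts its majority class."""
    by_value = defaultdict(Counter)
    for value, label in zip(column, labels):
        by_value[value][label] += 1
    correct = sum(max(counts.values()) for counts in by_value.values())
    return 1.0 - correct / len(labels)

def rank_attributes(rows, labels):
    """Horizontal compression: attribute indices ordered by OneR error, best first."""
    errors = [oner_error([row[j] for row in rows], labels) for j in range(len(rows[0]))]
    return sorted(range(len(errors)), key=lambda j: errors[j])

def affinity_propagation(points, damping=0.5, iterations=200):
    """Vertical compression: return indices of exemplar points (needs at least 2 points)."""
    n = len(points)
    # Assumed similarity: negative squared Euclidean distance.
    S = [[-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    off_diagonal = sorted(S[i][k] for i in range(n) for k in range(n) if i != k)
    preference = off_diagonal[len(off_diagonal) // 2]  # assumed: (upper) median similarity
    for k in range(n):
        S[k][k] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            for k in range(n):
                best_other = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) = sum over i' != k of max(0, r(i',k))
        for k in range(n):
            positive = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(positive) - positive[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - positive[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    return [k for k in range(n) if R[k][k] + A[k][k] > 0]

# Toy connection records: (protocol, service, flag), purely illustrative.
rows = [("tcp", "http", 0), ("tcp", "http", 1), ("udp", "dns", 0),
        ("udp", "dns", 1), ("tcp", "ftp", 0), ("icmp", "ecr", 1)]
labels = ["normal", "normal", "normal", "normal", "attack", "attack"]
print(rank_attributes(rows, labels))  # [1, 0, 2]

# Toy numeric records forming two groups; exemplars stand in for the full set.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
print(affinity_propagation(points))
```

In a pipeline of the kind the abstract describes, only the top-ranked attributes would be kept, and a standard classifier would then be trained on the exemplars instead of on the full training set. The sketch builds a full pairwise similarity matrix, so it is only practical for small inputs.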
