Generating artificial attack data for intrusion detection using machine learning

Intrusion detection based on machine learning is currently attracting considerable interest from the research community. One appealing property of machine-learning-based intrusion detection systems is their ability to detect new and unknown attacks. To apply machine learning to intrusion detection, a large number of both attack and normal data samples must be collected. While it is often easy to sample benign data from the normal behavior of a network, intrusive data is much scarcer and therefore more difficult to collect. In this paper, we propose a novel solution to this problem: generating artificial attack data for intrusion detection using machine learning techniques. Various machine learning techniques are used to evaluate the effectiveness of the generated data, and the results show that a data set combining synthetic attack data with normal data can help machine learning methods achieve good performance on the intrusion detection problem.
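
The workflow the abstract describes (generate artificial attacks, combine them with normal data, train a classifier, evaluate) can be illustrated with a minimal sketch. The abstract does not state how the attack data is generated, so everything below is an assumption made for illustration: the two-feature Gaussian "normal" data, the perturbation rule in `make_artificial_attacks`, the `shift` parameter, and the 1-nearest-neighbour classifier are all hypothetical stand-ins, not the paper's method. The held-out "attacks" are also synthetic here, whereas a real evaluation would use genuine attack samples.

```python
import math
import random


def make_normal(n, rng):
    # Hypothetical "normal" traffic: two features clustered around (0, 0).
    return [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n)]


def make_artificial_attacks(normal, rng, shift=6.0):
    # Assumed generation rule (NOT taken from the paper): copy each normal
    # sample and push one randomly chosen feature far from its usual range.
    attacks = []
    for sample in normal:
        fake = list(sample)
        i = rng.randrange(len(fake))
        fake[i] += shift if rng.random() < 0.5 else -shift
        attacks.append(fake)
    return attacks


def predict_1nn(train_x, train_y, x):
    # 1-nearest-neighbour: return the label of the closest training sample.
    nearest = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    return train_y[nearest]


def run_experiment(seed=0):
    rng = random.Random(seed)

    # Training set: normal samples plus artificial attacks derived from them.
    train_normal = make_normal(200, rng)
    train_attack = make_artificial_attacks(train_normal, rng)
    train_x = train_normal + train_attack
    train_y = [0] * len(train_normal) + [1] * len(train_attack)

    # Held-out set built the same way (a stand-in for real attack traffic).
    test_normal = make_normal(100, rng)
    test_attack = make_artificial_attacks(make_normal(100, rng), rng)
    test_x = test_normal + test_attack
    test_y = [0] * len(test_normal) + [1] * len(test_attack)

    correct = sum(
        predict_1nn(train_x, train_y, x) == y for x, y in zip(test_x, test_y)
    )
    return correct / len(test_x)


if __name__ == "__main__":
    print(f"held-out accuracy: {run_experiment():.3f}")
```

The point of the sketch is only the data flow: the classifier never sees a collected attack during training, yet it can still learn a boundary around normal behavior because the artificial samples mark the region outside it.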
