Feature Selection of Intrusion Detection Data using a Hybrid Genetic Algorithm/KNN Approach

Feature selection is an important part of information processing and system development. Selecting an appropriate set of features can provide insight into the underlying processes present in a data set and can greatly improve the accuracy of the overall classification model. In this paper we investigate a hybrid genetic algorithm/k-nearest neighbour approach to feature selection and apply it to an intrusion detection data set. We find that this feature selection process identifies features that are important for recognising the different types of attack present in the data set, leading to improved classification accuracy.
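As an illustration of the general idea only, the following minimal sketch shows one way a genetic algorithm can be wrapped around a k-nearest neighbour classifier for feature selection. It is not the implementation used in this paper: the binary mask encoding, the leave-one-out fitness, the truncation selection with one-point crossover, and all names and parameter values (`knn_predict`, `fitness`, `select_features`, `pop_size`, `mutation_rate`, and so on) are assumptions made for the example.

```python
import random


def knn_predict(train_X, train_y, x, mask, k=3):
    """Classify x by majority vote of its k nearest training rows.

    Distances are squared Euclidean, computed only over features
    whose mask bit is 1.
    """
    dists = sorted(
        (sum((a - b) ** 2 for a, b, m in zip(row, x, mask) if m), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def fitness(mask, X, y, k=3):
    """Leave-one-out KNN accuracy using only the masked features."""
    if not any(mask):
        return 0.0  # an empty feature set cannot classify anything
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, X[i], mask, k) == y[i]:
            correct += 1
    return correct / len(X)


def select_features(X, y, pop_size=12, generations=10,
                    mutation_rate=0.1, seed=0):
    """Evolve a binary feature mask; returns (best_mask, best_fitness).

    Assumes at least two features and at least k + 1 samples.
    """
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, X, y), reverse=True)
        parents = ranked[:pop_size // 2]   # truncation selection
        children = [ranked[0][:]]          # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)      # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
    best = max(pop, key=lambda m: fitness(m, X, y))
    return best, fitness(best, X, y)
```

Here `X` is a list of equal-length numeric feature vectors and `y` the matching list of class labels; a 1 in the returned mask marks a feature that the search retained. A wrapper of this kind evaluates each candidate subset with the classifier itself, which is costly but ties the selected features directly to classification accuracy.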
