Combining Feature Selection and Local Modelling in the KDD Cup 99 Dataset

This work introduces a new approach to intrusion detection in computer networks. The proposed method combines feature selection methods with a novel local classification method, called FVQIT (Frontier Vector Quantization using Information Theory), which uses a modified clustering algorithm to split the feature space into several local models, in each of which the classification task is performed independently. The method is evaluated on the KDD Cup 99 benchmark dataset, with the objective of improving on the performance achieved by previous authors. The experimental results indicate the adequacy of the proposed approach.
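The general idea of local modelling on a reduced feature set can be illustrated with a minimal sketch. This is not the FVQIT algorithm: plain k-means stands in for the modified clustering step, a nearest-class-mean rule stands in for the per-region classifier, and the feature subset `selected` is assumed to come from some feature selection filter rather than being computed here. All function names and the toy data are hypothetical.

```python
from collections import defaultdict

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def project(X, selected):
    """Keep only the columns chosen by a (separate) feature selection step."""
    return [[row[j] for j in selected] for row in X]

def kmeans(X, n_clusters, n_iter=20):
    """Plain k-means, a stand-in for the paper's modified clustering step."""
    centers = [list(x) for x in X[:n_clusters]]  # deterministic initialisation
    for _ in range(n_iter):
        groups = [[] for _ in range(n_clusters)]
        for x in X:
            nearest = min(range(n_clusters), key=lambda i: sq_dist(x, centers[i]))
            groups[nearest].append(x)
        # Keep the old center if a cluster received no points.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

def fit_local_models(X, y, centers):
    """Fit one independent nearest-class-mean classifier per region."""
    regions = defaultdict(lambda: defaultdict(list))
    for x, label in zip(X, y):
        region = min(range(len(centers)), key=lambda i: sq_dist(x, centers[i]))
        regions[region][label].append(x)
    return {r: {label: mean(pts) for label, pts in by_label.items()}
            for r, by_label in regions.items()}

def predict(x, centers, models):
    """Route x to the nearest region that has a model, then classify locally."""
    region = min(models, key=lambda r: sq_dist(x, centers[r]))
    class_means = models[region]
    return min(class_means, key=lambda label: sq_dist(x, class_means[label]))

# Toy data (hypothetical): the third column is irrelevant to the class.
X_raw = [[0, 0, 9], [10, 5, 1], [1, 0, 3], [11, 5, 7],
         [0, 2, 2], [10, 7, 8], [1, 2, 4], [11, 7, 6]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected = [0, 1]  # assumed output of a feature selection filter
X = project(X_raw, selected)
centers = kmeans(X, n_clusters=2)
models = fit_local_models(X, y, centers)
predictions = [predict(x, centers, models) for x in X]
```

In this toy setting the class boundary sits at a different position in each region, so a single global nearest-class-mean rule misclassifies points such as `[10, 5.2]`, whereas the per-region models handle them. The sketch only motivates splitting the feature space; it says nothing about the performance of the actual method on KDD Cup 99.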
