Semi-Supervised Dynamic Classification for Intrusion Detection

This paper introduces a new framework for building an adaptive classifier. First, a clustering algorithm, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), is applied to a set of sample data to form an initial set of clusters. Each cluster is treated as a class, and a support vector machine (SVM) is trained on these classes to produce a classifier model. In real-world applications, data arrive continuously. If the model does not learn from the new data, it may perform poorly on them, especially when the new data differ from the data on which the model was trained. The proposed framework therefore rebuilds the classifier model using selected data from the test data set to improve the model's accuracy. A case study on an intrusion detection data set was performed to evaluate the methodology. The results show that this approach leads to more accurate classification models over time.
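The pipeline described in the abstract (cluster with DBSCAN, treat clusters as classes, train an SVM, then rebuild the model with selected new data) can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available. The abstract does not say how the new data are selected, so the rule used here (keep new points whose SVM decision margin exceeds `min_confidence`, labelled with the current model's predictions) is an assumption for illustration only. The function names, parameter values, and synthetic data are likewise hypothetical and are not taken from the paper.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.svm import SVC


def build_classifier(X, eps=0.8, min_samples=5):
    """Cluster X with DBSCAN, treat each cluster as a class, and fit an SVM."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    keep = labels != -1  # DBSCAN labels noise points as -1; drop them
    if np.unique(labels[keep]).size < 2:
        raise ValueError("DBSCAN found fewer than two clusters")
    model = SVC(kernel="rbf", gamma="scale").fit(X[keep], labels[keep])
    return model, X[keep], labels[keep]


def rebuild_classifier(model, X_train, y_train, X_new, min_confidence=0.5):
    """Refit on the training set plus confidently pseudo-labelled new points.

    ASSUMPTION: the margin-threshold selection rule is illustrative; the
    abstract does not specify how the data are selected.
    """
    scores = model.decision_function(X_new)
    if scores.ndim == 1:
        # Two classes: one signed score per sample.
        confidence = np.abs(scores)
    else:
        # More classes: gap between the two highest one-vs-rest scores.
        ordered = np.sort(scores, axis=1)
        confidence = ordered[:, -1] - ordered[:, -2]
    selected = confidence >= min_confidence
    if not selected.any():
        return model, X_train, y_train  # nothing to learn from; keep the model
    X_aug = np.vstack([X_train, X_new[selected]])
    y_aug = np.concatenate([y_train, model.predict(X_new[selected])])
    new_model = SVC(kernel="rbf", gamma="scale").fit(X_aug, y_aug)
    return new_model, X_aug, y_aug


# Synthetic stand-in for intrusion detection features (hypothetical data):
# two well-separated groups, followed by a slightly shifted stream.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
X_initial = np.vstack([c + 0.3 * rng.standard_normal((50, 2)) for c in centers])
X_stream = np.vstack([c + 0.5 + 0.3 * rng.standard_normal((50, 2)) for c in centers])

model, X_train, y_train = build_classifier(X_initial)
model, X_train, y_train = rebuild_classifier(model, X_train, y_train, X_stream)
```

In a streaming setting, `rebuild_classifier` would be called once per incoming batch, so the training set tracks gradual shifts in the data. A confidence threshold is only one possible selection rule; it favours points the current model already classifies well, which may be less suitable when the data change abruptly.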
