An efficient approach to detecting concept-evolution in network data streams

An important challenge in network management and intrusion detection is classifying data streams to identify new and abnormal traffic flows. An open research issue in this context is concept-evolution, the emergence of a new class in the data stream. Most traditional classification techniques assume that the number of classes does not change over time. That assumption does not hold in real-world networks, and existing methods generally cannot identify a new class as it emerges in the data stream. In this paper, we present an approach to detecting novel classes in data streams that exhibit concept-evolution. In particular, our approach can improve both accuracy and computational efficiency by eliminating “noise” clusters from the analysis of concept-evolution. Through an evaluation on simulated and benchmark data sets, we demonstrate that our approach achieves accuracy comparable to an existing scheme from the literature with a significant reduction in computational complexity.
