Concept Drift Detection for Graph-Structured Classifiers under Scarcity of True Labels

Classifiers that remain reliable as a data stream evolves, whether through concept drift or concept evolution, are highly desirable for data stream mining. Most existing methods handle these phenomena in a supervised manner, which is costly in real-world settings where true labels are expensive to obtain. To address this shortcoming, we propose a concept drift detection approach that works with a semi-supervised adaptive incremental neural gas (A2ING) classifier. The approach uses the graph topology maintained by A2ING to detect changes in the data stream: we derive a measure of the graph's instability around its decision boundary and measure the difference between the prior and posterior distributions of this criterion. Empirical results show that the method detects changes effectively while requiring relatively few true labels compared with existing approaches.
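The general idea of scoring instability near a decision boundary and comparing its distribution across two windows can be illustrated with a minimal sketch. This is not the method of the paper: the instability score (a nearest-node distance ratio), the use of a two-sample Kolmogorov-Smirnov statistic, and all names (`instability`, `ks_statistic`, `drift_detected`, the `nodes` list of labeled prototypes standing in for graph nodes) are assumptions made purely for illustration.

```python
import math

def instability(x, nodes):
    """Hypothetical boundary score in [0, 1].

    nodes is a list of (point, label) pairs with at least two distinct labels.
    The score is the distance to the nearest node divided by the distance to
    the nearest node with a different label; values near 1 mean x lies close
    to the boundary between two classes.
    """
    nearest_point, nearest_label = min(nodes, key=lambda n: math.dist(x, n[0]))
    d_same = math.dist(x, nearest_point)
    d_other = min(math.dist(x, p) for p, label in nodes if label != nearest_label)
    return d_same / d_other if d_other > 0 else 1.0

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between the two ECDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Advance both pointers past every value <= v so ties are handled.
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap

def drift_detected(reference, current, nodes, alpha=0.05):
    """Flag drift when the instability scores of the two windows differ.

    Uses the standard large-sample KS critical value
    sqrt(-ln(alpha / 2) / 2) * sqrt((n + m) / (n * m)).
    No labels are needed for the windows themselves, only for the nodes.
    """
    ref_scores = [instability(x, nodes) for x in reference]
    cur_scores = [instability(x, nodes) for x in current]
    n, m = len(ref_scores), len(cur_scores)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(ref_scores, cur_scores) > critical
```

In this sketch only the prototypes carry labels, so the windows being compared can be entirely unlabeled; true labels would be requested only after a change is flagged.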
