A Classifier Graph Based Recurring Concept Detection and Prediction Approach

It is common in real-world data streams that previously seen concepts will reappear, which suggests a unique kind of concept drift, known as recurring concepts. Unfortunately, most of existing algorithms do not take full account of this case. Motivated by this challenge, a novel paradigm was proposed for capturing and exploiting recurring concepts in data streams. It not only incorporates a distribution-based change detector for handling concept drift but also captures recurring concept by storing recurring concepts in a classifier graph. The possibility of detecting recurring drifts allows reusing previously learnt models and enhancing the overall learning performance. Extensive experiments on both synthetic and real-world data streams reveal that the approach performs significantly better than the state-of-the-art algorithms, especially when concepts reappear.

[1]  João Gama,et al.  Issues in evaluation of stream learning algorithms , 2009, KDD.

[2]  Gerhard Widmer,et al.  Learning in the Presence of Concept Drift and Hidden Contexts , 1996, Machine Learning.

[3]  Mohammed El Amine Bechar,et al.  Statistical Comparisons of the Top 10 Algorithms in Data Mining for Classification Task , 2016 .

[4]  Robi Polikar,et al.  Incremental Learning of Concept Drift in Nonstationary Environments , 2011, IEEE Transactions on Neural Networks.

[5]  A. Bifet,et al.  Early Drift Detection Method , 2005 .

[6]  Ricard Gavaldà,et al.  Learning from Time-Changing Data with Adaptive Windowing , 2007, SDM.

[7]  Ernestina Menasalvas Ruiz,et al.  Mining Recurring Concepts in a Dynamic Feature Space , 2014, IEEE Transactions on Neural Networks and Learning Systems.

[8]  João Gama,et al.  A survey on concept drift adaptation , 2014, ACM Comput. Surv..

[9]  Jelle J. Goeman,et al.  Multiple hypothesis testing in genomics , 2014, Statistics in medicine.

[10]  Gregory Ditzler,et al.  Learning in Nonstationary Environments: A Survey , 2015, IEEE Computational Intelligence Magazine.

[11]  João Gama,et al.  Learning with Drift Detection , 2004, SBIA.

[12]  William Nick Street,et al.  A streaming ensemble algorithm (SEA) for large-scale classification , 2001, KDD '01.

[13]  Yun Sing Koh,et al.  Detecting concept change in dynamic data streams , 2013, Machine Learning.

[14]  Janez Demsar,et al.  Statistical Comparisons of Classifiers over Multiple Data Sets , 2006, J. Mach. Learn. Res..

[15]  Jesús S. Aguilar-Ruiz,et al.  Knowledge discovery from data streams , 2009, Intell. Data Anal..

[16]  Geoff Holmes,et al.  MOA: Massive Online Analysis , 2010, J. Mach. Learn. Res..

[17]  Z. Šidák Rectangular Confidence Regions for the Means of Multivariate Normal Distributions , 1967 .

[18]  W. Hoeffding Probability Inequalities for sums of Bounded Random Variables , 1963 .

[19]  Xin Yao,et al.  DDD: A New Ensemble Approach for Dealing with Concept Drift , 2012, IEEE Transactions on Knowledge and Data Engineering.

[20]  Marcus A. Maloof,et al.  Dynamic Weighted Majority: An Ensemble Method for Drifting Concepts , 2007, J. Mach. Learn. Res..

[21]  Liva Ralaivola,et al.  Empirical Bernstein Inequalities for U-Statistics , 2010, NIPS.

[22]  H. Chernoff A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the sum of Observations , 1952 .

[23]  Gerhard Widmer,et al.  Tracking Context Changes through Meta-Learning , 1997, Machine Learning.

[24]  Alexey Tsymbal,et al.  The problem of concept drift: definitions and related work , 2004 .

[25]  Cynthia Rudin,et al.  Online coordinate boosting , 2008, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[26]  Abad Miguel Ángel,et al.  Predicting recurring concepts on data-streams by means of a meta-model and a fuzzy similarity function , 2016 .

[27]  Roberto Souto Maior de Barros,et al.  RCD: A recurring concept drift framework , 2013, Pattern Recognit. Lett..

[28]  Grigorios Tsoumakas,et al.  Tracking recurring contexts using ensemble classifiers: an application to email filtering , 2009, Knowledge and Information Systems.

[29]  Philip S. Yu,et al.  Mining concept-drifting data streams using ensemble classifiers , 2003, KDD '03.

[30]  Dimitris K. Tasoulis,et al.  Exponentially weighted moving average charts for detecting concept drift , 2012, Pattern Recognit. Lett..

[31]  Ernestina Menasalvas Ruiz,et al.  Predicting recurring concepts on data-streams by means of a meta-model and a fuzzy similarity function , 2016, Expert Syst. Appl..