Drift-detection Based Incremental Ensemble for Reacting to Different Kinds of Concept Drift

Data stream mining has attracted attention in recent years due to its wide range of applications. Concept drift is a major challenge when learning from data streams. Existing algorithms are generally designed for one particular type of concept drift, whereas real-world data streams typically exhibit complex combinations of several types. In this paper, we propose a data-stream ensemble classifier that reacts to different types of concept drift, called Drift-detection based Incremental Ensemble (DIE). DIE combines concept-drift detection with a component update mechanism. Within a chunk-based framework, a drift detector monitors changes in the data distribution. When the detector signals a concept drift, DIE replaces the old tree with the alternate tree of the Hoeffding Adaptive Tree, rather than merely updating the weights of the ensemble members, which improves the model's ability to handle sudden drift. We also present a component update mechanism that adjusts existing ensemble members using the latest examples, which makes DIE suitable for handling slow drift as well. Experimental studies demonstrate the effectiveness of DIE in dealing with different kinds of concept drift.
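
The chunk-based loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: `CountingLearner` is a toy stand-in for a Hoeffding Adaptive Tree, the detector is a simple error-rate threshold (`drift_margin` is a hypothetical parameter), and discarding the stale members and restarting from a learner trained on the current chunk only approximates the switch to the alternate tree. All names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


class CountingLearner:
    """Toy incremental classifier over one discrete feature
    (an assumed stand-in for a Hoeffding Adaptive Tree)."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def learn(self, x, y):
        self.counts[x][y] += 1

    def predict(self, x):
        seen = self.counts.get(x)
        return seen.most_common(1)[0][0] if seen else 0


def error_rate(predict, chunk):
    """Fraction of examples in the chunk that `predict` gets wrong."""
    return sum(predict(x) != y for x, y in chunk) / len(chunk)


class ChunkEnsemble:
    """Weighted ensemble with a threshold drift detector (hypothetical sketch)."""

    def __init__(self, max_members=3, drift_margin=0.3):
        self.members = []            # list of [learner, weight]
        self.max_members = max_members
        self.drift_margin = drift_margin
        self.best_error = 1.0        # lowest chunk error under the current concept

    def predict(self, x):
        votes = Counter()
        for learner, weight in self.members:
            votes[learner.predict(x)] += weight
        return votes.most_common(1)[0][0] if votes else 0

    def process_chunk(self, chunk):
        # 1. Test first: ensemble error on the incoming chunk.
        err = error_rate(self.predict, chunk)
        # 2. Drift detection: error rose well above the best level seen so far.
        drift = bool(self.members) and err > self.best_error + self.drift_margin
        if drift:
            # Sudden drift: drop stale members instead of only reweighting them.
            self.members, self.best_error = [], 1.0
        elif self.members:
            self.best_error = min(self.best_error, err)
            # 3. Component update: refresh existing members with the latest examples.
            for learner, _ in self.members:
                for x, y in chunk:
                    learner.learn(x, y)
        # 4. Add a candidate trained on the current chunk, then reweight and prune.
        fresh = CountingLearner()
        for x, y in chunk:
            fresh.learn(x, y)
        self.members.append([fresh, 1.0])
        for member in self.members:
            member[1] = 1.0 - error_rate(member[0].predict, chunk)
        self.members.sort(key=lambda m: m[1], reverse=True)
        del self.members[self.max_members:]
        return err, drift


# Synthetic stream with a sudden drift: y = x for three chunks, then y = 1 - x.
concept_a = [(i % 2, i % 2) for i in range(20)]
concept_b = [(i % 2, 1 - i % 2) for i in range(20)]
stream = [concept_a] * 3 + [concept_b] * 3

ensemble = ChunkEnsemble()
results = [ensemble.process_chunk(chunk) for chunk in stream]
for index, (err, drift) in enumerate(results, start=1):
    print(f"chunk {index}: error={err:.2f} drift={drift}")
```

On this synthetic stream the detector fires on the fourth chunk, where the concept flips, and the error returns to zero on the following chunk. In the absence of a drift signal, the component-update step keeps older members current, which is the part of the design aimed at slow drift.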
