A Selective Detector Ensemble for Concept Drift Detection

Concept drifts usually originate from many causes rather than a single one, which results in two types of concept drift: abrupt drifts and gradual drifts. From the point of view of speed, concept drifts pose strong challenges for data stream mining. In this paper, we propose a selective detector ensemble to detect both abrupt and gradual drifts. We first present our detector ensemble construction method, and then describe how to use this ensemble to detect concept drifts with the proposed early-find-early-report rule. To evaluate the performance of our method, we compare it with four drift detection methods on eight publicly available data sets containing various concept drifts. The experimental results show that, compared with those benchmarks, our ensemble method improves recall and lowers the false negative rate without significantly increasing the false positive rate, and generalizes better than methods based on a single change indicator.
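The abstract does not define the early-find-early-report rule, so the following is only a minimal sketch under one assumed reading of it: the ensemble reports a drift at the first stream position where any member detector fires. The detector interface, the toy sliding-window detector, and the names `make_window_detector` and `first_report` are hypothetical illustrations, not the method of the paper.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical interface: a detector consumes one 0/1 error indicator
# and returns True when it believes a drift has occurred.
Detector = Callable[[int], bool]


def make_window_detector(window: int, threshold: float) -> Detector:
    """Toy detector: fires when the error rate over the last `window`
    items exceeds `threshold` (an illustrative change indicator only)."""
    recent: List[int] = []

    def step(error: int) -> bool:
        recent.append(error)
        if len(recent) > window:
            recent.pop(0)
        # Only judge once the window is full.
        return len(recent) == window and sum(recent) / window > threshold

    return step


def first_report(detectors: Sequence[Detector],
                 errors: Sequence[int]) -> Optional[int]:
    """Assumed rule: return the index of the first item at which any
    detector in the ensemble fires, or None if none ever fires."""
    for i, e in enumerate(errors):
        # Feed every detector so all of them stay in step with the stream.
        fired = [d(e) for d in detectors]
        if any(fired):
            return i
    return None


if __name__ == "__main__":
    # Abrupt change at index 20: errors jump from 0 to 1.
    stream = [0] * 20 + [1] * 20
    ensemble = [make_window_detector(5, 0.5), make_window_detector(10, 0.5)]
    print(first_report(ensemble, stream))
```

In this sketch the ensemble reports as early as its fastest member, which is the intuition behind combining detectors with different sensitivities; how the paper selects the members and controls false positives is not covered here.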
