Ensemble classifiers for drift detection and monitoring in dynamical environments

Detecting and monitoring changes during the learning process is an important area of research in many industrial applications. The challenge is to diagnose and analyze these changes so that the accuracy of the learning model can be preserved. Recently, ensemble classifiers have achieved good results when dealing with concept drift. This paper presents two ensemble learning algorithms, BagEDIST and BoostEDIST, which combine Online Bagging and Online Boosting, respectively, with the drift detection method EDIST. EDIST is a new drift detection method that monitors the distance between two consecutive classification errors. The idea behind this combination is to develop an ensemble learning algorithm that explicitly handles concept drift by describing the location, speed, and severity of each drift. Moreover, this paper presents a new drift diversity measure for studying the diversity of the base classifiers and how they cope with concept drift. In various experiments, this new measure provided a clearer picture of the ensemble's behavior when dealing with concept drift.
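
To make the idea of monitoring the distance between consecutive classification errors concrete, the following is a minimal Python sketch. It is not the EDIST algorithm itself, whose statistical test is not described in this abstract. The class name `ErrorDistanceMonitor`, the parameters `recent_size` and `threshold`, and the decision rule (signal drift when the mean of the most recent error distances falls clearly below the mean over all distances seen so far) are assumptions chosen only for illustration.

```python
from collections import deque
from statistics import mean, pstdev


class ErrorDistanceMonitor:
    """Hypothetical monitor of the distance between consecutive errors.

    The distance is the number of examples seen since the previous
    misclassification. Shrinking distances mean errors are becoming
    more frequent, which this sketch treats as a sign of drift.
    """

    def __init__(self, recent_size=30, threshold=2.0):
        self.recent = deque(maxlen=recent_size)  # latest error distances
        self.history = []                        # every error distance so far
        self.since_last_error = 0                # examples since last error
        self.threshold = threshold               # assumed sensitivity factor

    def update(self, is_error):
        """Feed one prediction outcome; return True if drift is signalled."""
        self.since_last_error += 1
        if not is_error:
            return False

        # An error occurred: record how far it is from the previous one.
        distance = self.since_last_error
        self.since_last_error = 0
        self.history.append(distance)
        self.recent.append(distance)

        # Wait until there is enough data for a meaningful comparison.
        if len(self.history) < 2 * self.recent.maxlen:
            return False

        global_mean = mean(self.history)
        std_error = pstdev(self.history) / len(self.recent) ** 0.5
        # Assumed rule: recent distances clearly shorter than usual.
        return mean(self.recent) < global_mean - self.threshold * std_error
```

In an ensemble setting such as the one described above, one such monitor could be attached to the ensemble or to each base classifier and fed the outcome of every prediction. A signal would then trigger whatever adaptation the ensemble uses, for example resetting a base classifier.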
