DRED: An evolutionary diversity generation method for concept drift adaptation in online learning environments

Abstract: Nowadays, fast-arriving information flows underpin many data mining applications. Such data streams are usually affected by non-stationary events that eventually change their distribution (concept drift), so that predictive models trained on these data become obsolete and fail to adapt to the new distribution. Especially in online learning scenarios, there is a pressing need for new algorithms that adapt to this change as fast as possible while maintaining good performance scores. Recent studies have revealed that a good strategy is to construct highly diverse ensembles and to use them shortly after the drift, independently of the type of drift, to obtain good performance scores. However, the trade-off between stability (performance over stable data concepts) and plasticity (recovery and adaptation after drift events) implies that the construction of the ensemble model should account simultaneously for these two conflicting objectives. In this regard, this work presents a new approach to artificially generate an optimal diversity level when building prediction ensembles shortly after a drift occurs. The approach uses a Kernel Density Estimation (KDE) method to generate synthetic data, which are subsequently labeled by means of a multi-objective optimization method that allows training each model of the ensemble with a different subset of synthetic samples. Computational experiments reveal that the proposed approach can be hybridized with other traditional diversity generation approaches, yielding optimized levels of diversity that render an enhanced recovery from drifts.
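
To make the synthetic-data step concrete, the following is a minimal sketch of drawing samples from a Gaussian KDE fitted to a window of recent stream instances. It is an illustration only, not the authors' implementation: the function name `sample_gaussian_kde`, the per-feature Scott's-rule bandwidth, and the diagonal Gaussian kernel are all assumptions made here, and the multi-objective labeling of the generated samples is not covered.

```python
import numpy as np

def sample_gaussian_kde(X, n_samples, bandwidth=None, rng=None):
    """Draw synthetic points from a Gaussian KDE centred on the rows of X.

    Hypothetical helper for illustration; bandwidth choice is an assumption.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if bandwidth is None:
        # Scott's rule of thumb, applied per feature (assumed, not from the paper).
        bandwidth = X.std(axis=0, ddof=1) * n ** (-1.0 / (d + 4))
    # Sampling from a KDE: pick a kernel centre uniformly at random,
    # then perturb it with noise drawn from the kernel itself.
    centres = X[rng.integers(0, n, size=n_samples)]
    noise = rng.normal(size=(n_samples, d)) * bandwidth
    return centres + noise

# Usage: 50 synthetic points from a window of 200 three-dimensional instances.
window = np.random.default_rng(0).normal(size=(200, 3))
synthetic = sample_gaussian_kde(window, n_samples=50, rng=1)
```

In a setup like the one the abstract describes, each ensemble member would then be trained on a different labeled subset of such synthetic samples.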
