Incremental Learning of Concept Drift in Nonstationary Environments

We introduce an ensemble-of-classifiers approach for incremental learning of concept drift in nonstationary environments (NSEs), where the underlying data distributions change over time. The proposed algorithm, Learn++.NSE, learns from consecutive batches of data without making any assumptions about the nature or rate of drift: it can learn from environments that experience a constant or variable rate of drift, the addition or deletion of concept classes, and cyclical drift. Like other members of the Learn++ family, the algorithm learns incrementally, that is, without requiring access to previously seen data. Learn++.NSE trains one new classifier for each batch of data it receives and combines these classifiers using dynamically weighted majority voting. The novelty of the approach lies in how the voting weights are determined: they are based on each classifier's time-adjusted accuracy on current and past environments. This allows the algorithm to recognize, and respond to, changes in the underlying data distributions, as well as a possible recurrence of an earlier distribution. We evaluate the algorithm on several synthetic datasets designed to simulate a variety of nonstationary environments, as well as on a real-world weather prediction dataset, and include comparisons with several other approaches. Results indicate that Learn++.NSE can track the changing environments very closely, regardless of the type of concept drift. To allow future use, comparison, and benchmarking by interested researchers, we also release the data used in this paper.
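The batch-wise scheme described above can be illustrated with a minimal Python sketch. It is not the published Learn++.NSE algorithm: the class names `ThresholdStump` and `TimeWeightedEnsemble`, the logistic recency weighting, and the parameters `slope`, `shift`, and `eps` are all assumptions made for illustration. The sketch only shows the general idea of training one classifier per batch, scoring every classifier on each new batch, and voting with weights derived from a recency-weighted error history, without storing earlier data.

```python
import math


class ThresholdStump:
    """Hypothetical 1-D base learner (an assumption, not from the paper)."""

    def fit(self, xs, ys):
        best = None
        for thr in sorted(set(xs)):
            for polarity in (1, -1):
                errs = sum(1 for x, y in zip(xs, ys)
                           if self._rule(x, thr, polarity) != y)
                if best is None or errs < best[0]:
                    best = (errs, thr, polarity)
        _, self.thr, self.polarity = best
        return self

    @staticmethod
    def _rule(x, thr, polarity):
        return int(x >= thr) if polarity == 1 else int(x < thr)

    def predict(self, x):
        return self._rule(x, self.thr, self.polarity)


class TimeWeightedEnsemble:
    """One classifier per batch, combined by a recency-weighted vote."""

    def __init__(self, slope=0.5, shift=10.0, eps=1e-6):
        # slope, shift, eps are illustrative assumptions.
        self.slope, self.shift, self.eps = slope, shift, eps
        self.members = []        # one trained classifier per batch
        self.error_history = []  # per classifier: errors on batches seen since creation

    def learn_batch(self, xs, ys):
        # Train a new classifier on the current batch only.
        self.members.append(ThresholdStump().fit(xs, ys))
        self.error_history.append([])
        # Score every classifier on the current batch; old data is never revisited.
        for clf, errors in zip(self.members, self.error_history):
            wrong = sum(1 for x, y in zip(xs, ys) if clf.predict(x) != y)
            err = wrong / len(xs)
            # Clip so that a classifier no better than chance gets zero weight.
            errors.append(min(max(err, self.eps), 0.5))

    def voting_weight(self, errors):
        # Logistic weights: recent batches (age 0) count more than older ones.
        sig = [1.0 / (1.0 + math.exp(self.slope * (age - self.shift)))
               for age in range(len(errors))]
        total = sum(sig)
        beta_bar = sum(s / total * (e / (1.0 - e))
                       for s, e in zip(sig, reversed(errors)))
        return math.log(1.0 / beta_bar)

    def predict(self, x):
        votes = {}
        for clf, errors in zip(self.members, self.error_history):
            label = clf.predict(x)
            votes[label] = votes.get(label, 0.0) + self.voting_weight(errors)
        return max(votes, key=votes.get)


if __name__ == "__main__":
    xs = [i / 10 for i in range(10)]
    concept_a = [int(x >= 0.5) for x in xs]  # label 1 on the right half
    concept_b = [int(x < 0.5) for x in xs]   # drifted concept: labels flipped
    ensemble = TimeWeightedEnsemble()
    for ys in (concept_a, concept_b, concept_a):  # drift, then recurrence
        ensemble.learn_batch(xs, ys)
        print(ensemble.predict(0.9))
```

In this toy setting a classifier trained on an outdated concept is clipped to an error of 0.5 on the new batch, so its vote shrinks, while its retained error history lets it regain influence if the earlier concept returns.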
