An ensemble based incremental learning framework for concept drift and class imbalance

We recently introduced Learn++.NSE, an incremental learning algorithm designed for nonstationary environments, which has been shown to provide an attractive solution to a number of concept drift problems under different drift scenarios. However, Learn++.NSE weights the classifiers in its ensemble by their error on the most recent data. For balanced class distributions this approach works very well, but when the data are imbalanced, error is no longer an acceptable measure of performance. The well-established SMOTE algorithm, on the other hand, can address class imbalance but cannot learn in nonstationary environments. While some literature addresses learning in nonstationary environments and learning from imbalanced data separately, the combined problem of learning from imbalanced data drawn from a nonstationary environment is underexplored. Therefore, in this work we propose two modified frameworks for an algorithm that can incrementally learn from imbalanced data coming from a nonstationary environment.
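To illustrate why raw error can mislead under class imbalance, the following is a minimal sketch, not taken from the paper. It assumes a hypothetical 95/5 class split and a trivial classifier that always predicts the majority class; the helper names `error_rate` and `recall` are illustrative only and are not part of Learn++.NSE.

```python
# Illustrative only: a hypothetical 95/5 imbalanced batch and a classifier
# that always predicts the majority class (label 0).

def error_rate(y_true, y_pred):
    """Fraction of examples whose predicted label differs from the true label."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)

def recall(y_true, y_pred, label):
    """Fraction of examples of class `label` that were predicted as `label`."""
    preds_for_class = [p for t, p in zip(y_true, y_pred) if t == label]
    if not preds_for_class:
        return 0.0
    return sum(1 for p in preds_for_class if p == label) / len(preds_for_class)

# Assumed data: 95 majority-class examples (0) and 5 minority-class examples (1).
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # always predict the majority class

err = error_rate(y_true, y_pred)
majority_recall = recall(y_true, y_pred, 0)
minority_recall = recall(y_true, y_pred, 1)

print(f"error = {err:.2f}")
print(f"majority recall = {majority_recall:.2f}")
print(f"minority recall = {minority_recall:.2f}")
```

In this assumed setting the error is low even though no minority example is recognized, so a weighting scheme driven by error alone would reward such a classifier. Per-class measures such as recall expose the failure.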
