An automatic construction and organization strategy for ensemble learning on data streams

As data streams gain prominence in a growing number of emerging application domains, classification on data streams has become an active research area. The typical approach to this problem is ensemble learning, which trains basic classifiers on the training data stream and combines them into a global predictor. While this approach is successful to some extent, its performance usually suffers from two conflicting demands that arise naturally in many application scenarios: on one hand, the need to gather sufficient training data for each basic classifier and to engage enough basic learners as voters for bias-variance reduction; on the other hand, the requirement of high sensitivity to concept drifts, which places emphasis on recent training data and up-to-date individual classifiers. The result is a dilemma: some algorithms are not sensitive enough to concept drifts, while others, although sensitive enough, suffer from unsatisfactory classification accuracy. In this paper, we propose an ensemble learning algorithm that: (1) furnishes training data for basic classifiers by starting from the most recent data chunk and searching past chunks for complementary data, while ruling out data inconsistent with the current concept; and (2) provides effective voting by adaptively distinguishing sensible classifiers from the rest and engaging only the sensible ones as voters. Experimental results justify the superiority of this strategy in terms of both accuracy and sensitivity, especially in severe circumstances where training data is extremely insufficient or concepts evolve frequently and significantly.
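The Python sketch below illustrates the two ideas described above under stated assumptions; it is not the authors' exact algorithm. The thresholds CONSISTENCY_THRESHOLD and VOTER_THRESHOLD, the helper select_consistent_history, and the use of decision trees as base learners are all illustrative choices, not details taken from the paper.

```python
# A minimal sketch of chunk-based ensemble learning with (1) selective reuse of
# past chunks consistent with the current concept and (2) selective voting.
# All thresholds and helper names are illustrative assumptions.
import numpy as np
from collections import deque
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

CONSISTENCY_THRESHOLD = 0.7   # assumed cutoff for reusing a past chunk
VOTER_THRESHOLD = 0.6         # assumed cutoff for letting a classifier vote
MAX_HISTORY = 10              # number of past chunks kept
MAX_ENSEMBLE = 10             # number of base classifiers kept

history = deque(maxlen=MAX_HISTORY)    # past (X, y) chunks
ensemble = deque(maxlen=MAX_ENSEMBLE)  # trained base classifiers

def select_consistent_history(probe_clf, history):
    """Collect past chunks that agree with the current concept, judged by the
    accuracy of a probe classifier trained on the newest chunk."""
    X_parts, y_parts = [], []
    for X_old, y_old in history:
        if accuracy_score(y_old, probe_clf.predict(X_old)) >= CONSISTENCY_THRESHOLD:
            X_parts.append(X_old)
            y_parts.append(y_old)
    return X_parts, y_parts

def process_chunk(X_new, y_new):
    """Update the ensemble with one incoming data chunk and return the voters."""
    # (1) a probe classifier trained on the newest chunk stands in for the current concept
    probe = DecisionTreeClassifier().fit(X_new, y_new)

    # complement the newest chunk with consistent past data only
    X_parts, y_parts = select_consistent_history(probe, history)
    X_train = np.vstack([X_new] + X_parts) if X_parts else X_new
    y_train = np.concatenate([y_new] + y_parts) if y_parts else y_new

    ensemble.append(DecisionTreeClassifier().fit(X_train, y_train))
    history.append((X_new, y_new))

    # (2) only classifiers that remain accurate on the newest chunk may vote
    voters = [clf for clf in ensemble
              if accuracy_score(y_new, clf.predict(X_new)) >= VOTER_THRESHOLD]
    return voters or list(ensemble)

def predict(voters, X):
    """Simple majority vote among the selected classifiers (integer class labels)."""
    votes = np.stack([clf.predict(X) for clf in voters]).astype(int)
    return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)
```

In this sketch, the classifier trained on the newest chunk acts as a probe for the current concept: past chunks it still predicts well are reused as complementary training data, and only ensemble members that remain accurate on the newest chunk are allowed to vote.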
