An ensemble method for concept drift in nonstationary environment

Most statistical and data-mining algorithms assume that data come from a stationary distribution. In many real-world classification tasks, however, data arrive over time, and the target concept to be learned from the data stream may change along the way. Many algorithms have been proposed for learning such drifting concepts. One of them, dynamic weighted majority, is an ensemble method designed for the case in which the distribution generating the data changes over time. Unfortunately, this technique considers neither the age of the classifiers in the ensemble nor their record of past correct classifications. In this paper, we propose a method that takes into account each expert's age as well as its contribution to the accuracy of the global algorithm. We evaluate the effectiveness of the proposed method using m classifiers trained on a collection of n-fold partitions of the data. Experimental results on a benchmark data set show that our method outperforms existing ones.
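To make the baseline concrete, the following is a minimal sketch of a dynamic-weighted-majority-style ensemble. It is not the method proposed in this paper. The multiplicative penalty `beta`, the pruning threshold `theta`, and the update `period` follow the usual description of dynamic weighted majority. The names `MajorityClassLearner`, `Expert`, and `DWMSketch` are hypothetical, as is the toy base learner. The `age` and `correct` fields only record the two quantities the abstract says the baseline ignores. Since the abstract does not state how the proposed method combines them, the sketch deliberately leaves them out of the weight update.

```python
from collections import Counter
from dataclasses import dataclass, field


class MajorityClassLearner:
    """Toy incremental base learner (an assumption): predicts the most frequent label seen."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Returns None until at least one label has been observed.
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn(self, x, y):
        self.counts[y] += 1


@dataclass
class Expert:
    learner: MajorityClassLearner = field(default_factory=MajorityClassLearner)
    weight: float = 1.0
    age: int = 0      # examples seen by this expert (bookkeeping only)
    correct: int = 0  # correct predictions so far (bookkeeping only)


class DWMSketch:
    """Sketch of a dynamic-weighted-majority-style ensemble (hypothetical class name)."""

    def __init__(self, beta=0.5, theta=0.01, period=1):
        self.beta, self.theta, self.period = beta, theta, period
        self.experts = [Expert()]
        self.t = 0

    def predict(self, x):
        # Weighted vote over the current experts.
        votes = Counter()
        for e in self.experts:
            votes[e.learner.predict(x)] += e.weight
        return votes.most_common(1)[0][0]

    def update(self, x, y):
        """Predict on (x, y), then adapt weights and experts; returns the ensemble prediction."""
        self.t += 1
        adapt = self.t % self.period == 0
        votes = Counter()
        for e in self.experts:
            pred = e.learner.predict(x)
            e.age += 1
            if pred == y:
                e.correct += 1
            elif adapt:
                e.weight *= self.beta  # penalize an expert that was wrong
            votes[pred] += e.weight
        global_pred = votes.most_common(1)[0][0]
        if adapt:
            top = max(e.weight for e in self.experts)
            for e in self.experts:
                e.weight /= top  # rescale so the largest weight is 1
            self.experts = [e for e in self.experts if e.weight >= self.theta]
            if global_pred != y:
                self.experts.append(Expert())  # new expert after an ensemble mistake
        for e in self.experts:
            e.learner.learn(x, y)
        return global_pred


# Synthetic stream with one abrupt drift: label "a" for 20 steps, then "b".
stream = [((), "a")] * 20 + [((), "b")] * 20
model = DWMSketch()
predictions = [model.update(x, y) for x, y in stream]
```

On this stream, experts trained before the drift lose weight and are eventually pruned, while experts created after it take over the vote. An age-aware or accuracy-aware variant would presumably read `age` and `correct` when penalizing or removing experts, but the exact rule is left open here.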
