Tracking Recurring Concepts with Meta-learners

This work addresses data stream mining in dynamic environments, where the distribution underlying the observations may change over time. In these contexts, learning algorithms must be equipped with change detection mechanisms. Several methods have been proposed that detect and react to concept drift. When a drift is signaled, most approaches use a forgetting mechanism: they discard the current model and start learning a new decision model. Nevertheless, it is not rare for concepts from the past to reappear, for example with seasonal changes. In this work we present a method that memorizes learnt decision models whenever a concept drift is signaled. The system uses meta-learning techniques that characterize the domain of applicability of previously learnt models. The meta-learner can detect the re-occurrence of contexts and take proactive action by activating previously learnt models. The main benefit of this approach is that the proposed meta-learner is capable of selecting a similar historical concept, if there is one, without knowledge of the true classes of the examples.
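The idea of storing each model together with a description of where it was reliable can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the method of this work: the names `Concept`, `applicability`, and `select_stored` are hypothetical, the base model is a simple nearest-centroid classifier, the referee is a 1-nearest-neighbour lookup over past examples, and the threshold of 0.7 is arbitrary.

```python
import math


class Concept:
    """A stored base model plus a referee describing where it was reliable."""

    def __init__(self):
        self.sums = {}      # label -> per-feature running sums
        self.counts = {}    # label -> number of examples seen
        self.referee = []   # (features, base_model_was_correct) pairs

    def predict(self, x):
        # Nearest-centroid base model; returns None before any training.
        best, best_d = None, math.inf
        for label, s in self.sums.items():
            centroid = [v / self.counts[label] for v in s]
            d = math.dist(x, centroid)
            if d < best_d:
                best, best_d = label, d
        return best

    def learn(self, x, y):
        # Test-then-train: record whether the model was right, then update it.
        self.referee.append((tuple(x), self.predict(x) == y))
        s = self.sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        self.counts[y] = self.counts.get(y, 0) + 1

    def applicability(self, batch):
        # Fraction of unlabeled examples whose nearest referee neighbour
        # was classified correctly. No true labels are needed here.
        if not self.referee or not batch:
            return 0.0
        hits = 0
        for x in batch:
            _, ok = min(self.referee, key=lambda r: math.dist(x, r[0]))
            hits += ok
        return hits / len(batch)


def select_stored(pool, batch, threshold=0.7):
    """Return the most applicable stored concept, or None to learn afresh."""
    best = max(pool, key=lambda c: c.applicability(batch), default=None)
    if best is not None and best.applicability(batch) >= threshold:
        return best
    return None
```

In this sketch, a drift signal would move the current `Concept` into `pool`, and `select_stored` would then be called on a short batch of recent unlabeled examples. The referee only sees labels while its own model is being trained, so the selection step itself uses features alone.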
