Tracking Drifting Concepts by Time Window Optimisation

This paper addresses the task of learning concept descriptions from streams of data. As new data arrive, the concept description must be updated regularly to incorporate them. The concept itself may, however, change over time, so that old data become irrelevant to the current concept and must be removed from the training set. This problem is known in machine learning as concept drift. We develop a mechanism that tracks changing concepts using an adaptive time window. The method uses a significance test to detect concept drift and then optimises the size of the time window, aiming to maximise classification accuracy on recent data. The method is general and can be used with any learning algorithm. It is tested with three standard learning algorithms (kNN, ID3 and NBC) on three datasets. The experimental results provide evidence that the suggested forgetting mechanism can significantly improve predictive accuracy on changing concepts.
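The two steps named above (detect drift with a significance test, then pick the window size that maximises accuracy on recent data) can be illustrated with a minimal sketch. The abstract does not say which test or which search procedure is used, so everything here is an assumption: a one-sided two-proportion z-test on accuracy, an exhaustive search over a few candidate window sizes, a 1-nearest-neighbour stand-in for the learner, and hypothetical names (`drift_detected`, `best_window`) and toy data throughout.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[float, int]  # (feature, label); hypothetical 1-D toy data


def one_nn(train: Sequence[Example], x: float) -> int:
    """Predict the label of x with a 1-nearest-neighbour rule (stand-in learner)."""
    return min(train, key=lambda e: abs(e[0] - x))[1]


def accuracy(train: Sequence[Example], test: Sequence[Example]) -> float:
    """Fraction of test examples classified correctly when learning from train."""
    if not train or not test:
        return 0.0
    return sum(one_nn(train, x) == y for x, y in test) / len(test)


def drift_detected(acc_old: float, n_old: int, acc_new: float, n_new: int,
                   z_crit: float = 1.645) -> bool:
    """Assumed test: one-sided two-proportion z-test for a drop in accuracy."""
    pooled = (acc_old * n_old + acc_new * n_new) / (n_old + n_new)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    if se == 0:
        return False
    return (acc_old - acc_new) / se > z_crit


def best_window(history: Sequence[Example], recent: Sequence[Example],
                candidates: Sequence[int]) -> int:
    """Return the candidate size whose window (the last `size` examples of
    history) gives the highest accuracy on the recent examples."""
    return max(candidates, key=lambda size: accuracy(history[-size:], recent))


# Toy stream: the concept flips from "x > 0.5" to "x < 0.5".
xs = [(i + 0.5) / 10 for i in range(10)]
old: List[Example] = [(x, int(x > 0.5)) for x in xs] * 2   # 20 outdated examples
new: List[Example] = [(x, int(x < 0.5)) for x in xs]       # 10 current examples
history = old + new
recent: List[Example] = [((i + 0.7) / 10, int((i + 0.7) / 10 < 0.5))
                         for i in range(10)]

if drift_detected(acc_old=0.9, n_old=100, acc_new=0.6, n_new=100):
    window = best_window(history, recent, candidates=[5, 10, 30])
    print("selected window size:", window)
```

In this toy setting the window that covers only the examples from the current concept scores best on the recent data, while a longer window pulls in outdated examples and a shorter one discards useful ones. A real implementation would replace the exhaustive search and the stand-in learner with whatever the paper actually specifies.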
