Using Control Charts for Detecting Concept Change in Streaming Data

We address adaptive online classification in the presence of concept change. A review of machine learning approaches reveals a shortage of methods for explicit change detection when the only available information is the classification error on the streaming data. We borrow from the long-standing research on monitoring process quality using control charts. Four change detection methods are detailed and compared in this paper: two from the machine learning literature and two control charts, the Shewhart chart and the Sequential Probability Ratio Test (SPRT). Control charts only signal that a change has occurred. To examine empirically their effect on classification accuracy, the Shewhart and SPRT methods were equipped with a window resizing heuristic: the training window grows until a change is detected and shrinks to a single batch upon detection. Experiments were carried out on 28 real data sets, where change was simulated by swapping class labels. A paired t-test on the classification error and a paired Wilcoxon signed-rank test identified SPRT as the best change detection method.
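
As a concrete illustration of the SPRT chart and the window resizing heuristic described above, here is a minimal Python sketch. It assumes a Bernoulli SPRT on the 0/1 error stream with Wald's upper decision boundary log((1-beta)/alpha) and a zero lower barrier; the majority-class stand-in learner, the parameter values (p0, p1, alpha, beta, batch size) and the simulated stream are placeholders for illustration, not the paper's actual experimental setup.

```python
import math
import random
from collections import Counter


class SPRTChart:
    """SPRT chart for a 0/1 misclassification stream.

    Tests H0: error rate == p0 (in control) against H1: error rate == p1 > p0.
    The cumulative log-likelihood ratio is clipped at zero so monitoring
    continues after evidence favouring H0, and the chart signals (and restarts)
    once the score crosses Wald's upper boundary.
    """

    def __init__(self, p0=0.10, p1=0.30, alpha=0.01, beta=0.01):
        self.step_error = math.log(p1 / p0)                # added when the classifier errs
        self.step_correct = math.log((1 - p1) / (1 - p0))  # added otherwise (negative)
        self.threshold = math.log((1 - beta) / alpha)      # upper decision boundary
        self.score = 0.0

    def update(self, is_error):
        """Feed one error indicator (1 = misclassified); return True on a signal."""
        self.score = max(0.0, self.score + (self.step_error if is_error else self.step_correct))
        if self.score > self.threshold:
            self.score = 0.0
            return True
        return False


def majority_class(window):
    """Placeholder base learner: predict the most frequent label in the window."""
    labels = [y for _, y in window]
    return Counter(labels).most_common(1)[0][0] if labels else 0


def run_stream(stream, batch_size=50):
    """Grow the window until the chart signals a change, then shrink it to one batch."""
    chart = SPRTChart()
    window, detections = [], []
    for t, (x, y) in enumerate(stream):
        y_hat = majority_class(window)
        if chart.update(int(y_hat != y)):
            window = window[-batch_size:]   # concept change: drop the stale history
            detections.append(t)
        window.append((x, y))
    return detections


if __name__ == "__main__":
    random.seed(0)
    # Simulated stream: the class prior flips at t = 1000, loosely mimicking
    # the label-swapping scheme used in the experiments.
    stream = [(None, int(random.random() < (0.1 if t < 1000 else 0.9))) for t in range(2000)]
    print("change signalled at:", run_stream(stream))
```

A Shewhart-style alternative would instead signal whenever a batch error rate exceeds p0 + 3*sqrt(p0*(1-p0)/n); only the SPRT variant is sketched here, since it came out best in the comparison.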
