Change with Delayed Labeling: When is it Detectable?

Handling changes over time in supervised learning (concept drift) has lately received a great deal of attention, and a number of adaptive learning strategies have been developed. Most of them make the optimistic assumption that new labels become available immediately. In real sequential classification tasks this is often unrealistic because of task-specific labeling delays or labeling costs. We address the problem of change detectability when the new labels are not available. In this analytical study we look at the space of changes from a probabilistic perspective to analyze which changes are detectable without seeing the labels and which are not. We conduct a range of experiments on real-life data with simulated and natural changes to explore this detectability issue. We propose a computationally friendly detection technique that monitors a stream of classifier outputs. We demonstrate analytically and experimentally which types of changes can be detected when the labels for the new data are not available.
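The idea of monitoring a stream of classifier outputs without labels can be illustrated with a minimal sketch. The abstract does not specify the detection statistic, so the code below is not the paper's method: it assumes a two-sample Kolmogorov-Smirnov test that compares a reference window of classifier scores with a recent window. The function names, the window contents, and the significance level `alpha` are all hypothetical choices made for illustration.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(reference, recent):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    rec = sorted(recent)
    gap = 0.0
    # The largest gap is always attained at one of the observed values.
    for x in ref + rec:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_rec = bisect_right(rec, x) / len(rec)
        gap = max(gap, abs(cdf_ref - cdf_rec))
    return gap


def ks_threshold(n, m, alpha=0.01):
    """Large-sample critical value of the two-sample KS test at level alpha."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))


def output_drift_detected(reference_scores, recent_scores, alpha=0.01):
    """Flag a change when the two score windows differ significantly.

    Assumption: the scores are one-dimensional classifier outputs
    (for example, posterior estimates); no labels are used.
    """
    d = ks_statistic(reference_scores, recent_scores)
    return d > ks_threshold(len(reference_scores), len(recent_scores), alpha)


if __name__ == "__main__":
    rng = random.Random(0)

    def score(x):
        # Hypothetical fixed classifier: a logistic function of one input.
        return 1.0 / (1.0 + math.exp(-x))

    reference = [score(rng.gauss(0.0, 1.0)) for _ in range(500)]
    same_inputs = [score(rng.gauss(0.0, 1.0)) for _ in range(500)]
    shifted_inputs = [score(rng.gauss(1.5, 1.0)) for _ in range(500)]

    print("unchanged inputs:", output_drift_detected(reference, same_inputs))
    print("shifted inputs:  ", output_drift_detected(reference, shifted_inputs))
```

Because a fixed classifier's outputs depend only on the inputs, a monitor of this kind can react only to changes that alter the distribution of those outputs. A change confined to the relationship between inputs and labels, with the input distribution unchanged, leaves the score stream untouched and so cannot be flagged this way.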
