Probabilistic Active Learning in Datastreams

In recent years, stream-based active learning has become an intensively investigated research topic. In this work, we propose a new algorithm for stream-based active learning that decides immediately whether to acquire a label (selective sampling). To this end, we extend our pool-based Probabilistic Active Learning framework to data streams. In particular, we complement the notion of usefulness within a topological space ("spatial usefulness") with the concept of "temporal usefulness". To actively select the instances for which labels must be acquired, we introduce the Balanced Incremental Quantile Filter (BIQF), an algorithm that assesses the usefulness of instances in a sliding window and ensures that predefined budget restrictions are met within a given tolerance window. We compare our approach to other stream-based active learning approaches and demonstrate the competitiveness of our method.
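The filtering idea described above — acquire a label only when an instance's usefulness score ranks in the top budget-fraction of a sliding window, while a balancing mechanism keeps the realised acquisition rate near the budget within a tolerance window — can be sketched as follows. This is an illustrative reconstruction under stated assumptions (quantile threshold on a sliding window, a simple over/under-budget balancing rule), not the authors' reference implementation of BIQF; the class and parameter names are hypothetical.

```python
from collections import deque

class QuantileFilterSketch:
    """Illustrative sliding-window quantile filter for selective sampling.

    Keeps the most recent usefulness scores in a window and acquires a
    label when the current score reaches the (1 - budget)-quantile of
    that window. A balancing term compares the acquisition rate over a
    recent tolerance window with the budget and tightens or relaxes the
    threshold comparison accordingly.
    """

    def __init__(self, budget=0.1, window_size=100, tolerance_window=50):
        self.budget = budget                      # target fraction of labels to buy
        self.scores = deque(maxlen=window_size)   # sliding window of scores
        self.decisions = deque(maxlen=tolerance_window)  # recent acquire decisions

    def acquire(self, usefulness):
        """Return True if a label should be acquired for this instance."""
        self.scores.append(usefulness)
        ranked = sorted(self.scores)
        # index of the (1 - budget)-quantile within the window
        k = min(int((1.0 - self.budget) * len(ranked)), len(ranked) - 1)
        threshold = ranked[k]
        # balancing: realised spending rate over the tolerance window
        spent = sum(self.decisions) / max(len(self.decisions), 1)
        if spent > self.budget:
            decision = usefulness > threshold     # over budget: be stricter
        else:
            decision = usefulness >= threshold    # under budget: be lenient
        self.decisions.append(int(decision))
        return decision
```

In this sketch the quantile is recomputed by sorting the window, which is fine for small windows; the paper's setting would call for an incremental quantile structure over the sliding window to keep per-instance cost low.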
