Active learning for classifying data streams with unknown number of classes

The classification of data streams is an interesting but challenging problem. A data stream may grow without bound, which makes it impractical to store the data before processing and classifying it. Because the stream is dynamic, its underlying distribution may change over time, resulting in so-called concept drift, or classes may emerge and fade, which is known as concept evolution. In addition, acquiring labels for the samples in a stream is expensive, if not infeasible. In this paper, we propose a novel stream-based active learning algorithm (SAL) that copes with both concept drift and concept evolution by adapting the classification model to the dynamic changes in the stream. SAL is the first active learning (AL) algorithm in the literature to explicitly take both phenomena into account. Moreover, SAL queries only the labels of samples that are expected to reduce the future error. It does so while tackling the problem of sampling bias, so that the samples that induce the change (i.e., drifting samples or samples coming from new classes) are queried. To implement SAL efficiently, the paper proposes the use of non-parametric Bayesian models, which make it possible to cope with the lack of prior knowledge about the data stream. In particular, Dirichlet mixture models and the stick-breaking process are adopted and adapted to meet the requirements of online learning. The empirical results obtained on real-world benchmarks demonstrate the superiority of SAL over state-of-the-art methods in terms of classification performance, measured by average accuracy and average class accuracy.
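The stick-breaking process mentioned above can be illustrated with a minimal sketch. This is the standard construction truncated to a fixed number of components, not the online variant the paper adapts. The function name, the concentration parameter `alpha`, and the truncation level `num_sticks` are assumptions chosen for illustration.

```python
import random


def stick_breaking_weights(alpha, num_sticks, rng):
    """Draw the first `num_sticks` mixture weights of a stick-breaking process.

    Hypothetical helper for illustration only; truncating at `num_sticks`
    is an assumption, since the full process has infinitely many weights.
    """
    weights = []
    remaining = 1.0  # length of the stick that has not been assigned yet
    for _ in range(num_sticks):
        # Break off a Beta(1, alpha) fraction of whatever is left.
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining


rng = random.Random(0)
weights, leftover = stick_breaking_weights(alpha=2.0, num_sticks=10, rng=rng)
print(weights, leftover)
```

The returned `leftover` is the mass reserved for components that have not appeared yet, which is what lets a mixture model of this kind accommodate a number of classes that is not known in advance. Larger values of `alpha` tend to spread the mass over more components.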
