Streaming Active Deep Forest for Evolving Data Stream Classification

In recent years, Deep Neural Networks (DNNs) have gained momentum in many areas of machine learning. The layer-by-layer processing of DNNs has inspired the development of many deep models, including deep ensembles. The most notable deep ensemble-based model is Deep Forest, which achieves highly competitive performance with far fewer hyper-parameters than DNNs. Despite its success in the batch learning setting, no effort has been made to adapt Deep Forest to the context of evolving data streams. In this work, we introduce the Streaming Deep Forest (SDF) algorithm, a high-performance deep ensemble method specially adapted to stream classification. We also present the Augmented Variable Uncertainty (AVU) active learning strategy to reduce the labeling cost in the streaming context. We compare the proposed methods to state-of-the-art streaming algorithms on a wide range of datasets. The results show that, by following the AVU active learning strategy, SDF with only 70% of the labeling budget significantly outperforms other methods trained on all instances.
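The variable-uncertainty idea that AVU builds on can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name, parameters, and the adaptive-threshold update rule are assumed from the general stream active learning literature (a label is requested when the classifier's maximum posterior falls below a threshold that adapts so the labeling budget is spent evenly over time).

```python
class VariableUncertaintySampler:
    """Hypothetical sketch of a variable-uncertainty active learning
    strategy for streams; names and defaults are illustrative only."""

    def __init__(self, threshold: float = 1.0, step: float = 0.01):
        self.threshold = threshold  # uncertainty threshold, adapted online
        self.step = step            # adaptation rate for the threshold

    def query(self, max_posterior: float) -> bool:
        """Decide whether to request the true label for one instance,
        given the classifier's maximum posterior probability."""
        if max_posterior < self.threshold:
            # Uncertain region: request the label and tighten the
            # threshold so labeling does not exceed the budget.
            self.threshold *= (1.0 - self.step)
            return True
        # Confident region: widen the threshold so that queries do not
        # dry up when the classifier becomes (perhaps wrongly) confident.
        self.threshold *= (1.0 + self.step)
        return False
```

In a streaming loop, each incoming instance would be classified first; `query` then decides whether to spend budget on its label before the instance is used for incremental training.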
