Streaming Random Patches for Evolving Data Stream Classification

Ensemble methods are a popular choice for learning from evolving data streams. This popularity is due to (i) the ability to simulate simple, yet successful, ensemble learning strategies such as bagging and random forests; (ii) the possibility of incorporating drift detection and recovery in conjunction with the ensemble algorithm; and (iii) the availability of efficient incremental base learners, such as Hoeffding Trees. In this work, we introduce the Streaming Random Patches (SRP) algorithm, an ensemble method specially adapted to stream classification that combines random subspaces and online bagging. We provide theoretical insights and empirical results illustrating different aspects of SRP. In particular, we explain how the widely adopted incremental Hoeffding trees are not, in fact, unstable learners, unlike their batch counterparts, and how this fact significantly influences ensemble method design and performance. We compare SRP against state-of-the-art ensemble variants for streaming data on a multitude of datasets. The results show that SRP produces high predictive performance on both real and synthetic datasets. In addition, we analyze diversity over time and average tree depth, which provides insight into the differences between local subspace randomization (as in random forests) and global subspace randomization (as in random subspaces).
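
To make the combination concrete, below is a minimal, self-contained Python sketch of the random-patches idea described above: each ensemble member is assigned one fixed random feature subset (global subspace randomization) and receives each arriving instance with a Poisson-distributed weight to simulate online bagging. This is an illustrative assumption, not the authors' MOA implementation: the class and parameter names are hypothetical, a trivial majority-class learner stands in for a Hoeffding Tree, and the drift detection and recovery machinery of the full algorithm is omitted. A Poisson rate of 6, as used in leveraging bagging, is assumed here.

```python
import random
from collections import Counter


class MajorityClassLearner:
    """Placeholder incremental learner; a Hoeffding Tree would be used in practice."""

    def __init__(self):
        self.class_counts = Counter()

    def partial_fit(self, x, y, weight=1):
        # x is ignored by this trivial learner; kept for interface symmetry.
        self.class_counts[y] += weight

    def predict(self, x):
        return self.class_counts.most_common(1)[0][0] if self.class_counts else None


class StreamingRandomPatchesSketch:
    """Random patches: per-member random feature subsets + Poisson online bagging."""

    def __init__(self, n_features, n_models=10, subspace_frac=0.6, lam=6.0, seed=1):
        self.rng = random.Random(seed)
        self.lam = lam  # Poisson rate; 6 is the value used by leveraging bagging
        k = max(1, int(subspace_frac * n_features))
        # Global subspace randomization: each member keeps one fixed feature
        # subset for its whole lifetime (unlike per-split sampling in a
        # random forest).
        self.subspaces = [self.rng.sample(range(n_features), k)
                          for _ in range(n_models)]
        self.models = [MajorityClassLearner() for _ in range(n_models)]

    def _poisson(self):
        # Knuth's algorithm; avoids external dependencies.
        threshold, k, p = 2.718281828 ** -self.lam, 0, 1.0
        while p > threshold:
            k += 1
            p *= self.rng.random()
        return k - 1

    def partial_fit(self, x, y):
        for model, subspace in zip(self.models, self.subspaces):
            w = self._poisson()  # simulates online resampling with replacement
            if w > 0:
                model.partial_fit([x[i] for i in subspace], y, weight=w)

    def predict(self, x):
        # Majority vote, each member seeing only its own feature subset.
        votes = Counter(m.predict([x[i] for i in s])
                        for m, s in zip(self.models, self.subspaces))
        return votes.most_common(1)[0][0]


if __name__ == "__main__":
    srp = StreamingRandomPatchesSketch(n_features=4)
    for x, y in [([5.1, 3.5, 1.4, 0.2], "a"), ([6.2, 2.9, 4.3, 1.3], "b")]:
        srp.partial_fit(x, y)
    print(srp.predict([5.0, 3.4, 1.5, 0.2]))
```

Note that each member's subspace is drawn once and reused for every instance; this fixed, per-learner sampling is what distinguishes the global randomization of random subspaces and SRP from the per-split feature sampling performed locally inside random forest trees.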
