Online adaptive decision trees based on concentration inequalities

Classification trees are a powerful tool for mining non-stationary data streams, in which massive amounts of data are generated continuously at high speed and the underlying target function can change over time. The iadem family of algorithms, based on Hoeffding's and Chernoff's bounds, induces online decision trees from data streams but cannot handle concept drift. This study extends the family to deal with time-changing data streams. The new online algorithm, named iadem-3, performs two main actions in response to a concept drift. First, it resets the variables affected by the change while keeping the structure of the tree intact, which accommodates changes in which consecutive target functions are very similar. Second, it creates alternative models that replace parts of the main tree when they significantly improve the accuracy of the model, thereby rebuilding the main tree if needed. An online change detector and a non-parametric statistical test based on Hoeffding's bounds are used to guarantee this significance. iadem-3 also incorporates a new pruning method that ensures all split tests previously installed in decision nodes remain useful. The learning model is also viewed as an ensemble of classifiers: the predictions of the main and alternative models are combined to classify unlabeled examples. iadem-3 is empirically compared with various well-known decision tree induction algorithms designed to handle concept drift. The results show that the new algorithm often reaches higher accuracy with smaller decision tree models while keeping the processing time bounded, irrespective of the number of instances processed.
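To make the role of Hoeffding's bound concrete, the following is a minimal sketch of how such a bound can decide whether an alternative model is significantly more accurate than the main one. It is not the test used by iadem-3, whose details the abstract does not give. The function names `hoeffding_epsilon` and `alternative_is_better`, the default confidence parameter `delta`, and the assumption that both models are scored on the same `n` examples are all hypothetical choices made for illustration.

```python
import math


def hoeffding_epsilon(n, delta, value_range=1.0):
    """One-sided Hoeffding margin for the mean of n bounded observations.

    With probability at least 1 - delta, the observed mean exceeds the
    true mean by less than the returned value. `value_range` is the
    width (b - a) of the interval containing each observation.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def alternative_is_better(main_correct, alt_correct, n, delta=0.05):
    """Illustrative replacement test (an assumption, not the paper's rule).

    Both models are assumed to have been evaluated on the same n examples.
    The per-example difference in correctness lies in [-1, 1], so the
    range passed to the bound is 2.
    """
    observed_gain = (alt_correct - main_correct) / n
    margin = hoeffding_epsilon(n, delta, value_range=2.0)
    # Replace only when the observed gain exceeds the margin, so that a
    # model with no true gain is accepted with probability at most delta.
    return observed_gain > margin


if __name__ == "__main__":
    # A gain of 10 points over 1000 examples clears the margin.
    print(alternative_is_better(700, 800, 1000))
    # A gain of 2 points over the same 1000 examples does not.
    print(alternative_is_better(700, 720, 1000))
```

Because the margin shrinks as `n` grows, a small but persistent accuracy gain eventually becomes significant, while a short lucky streak does not trigger a replacement.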
