Very fast decision rules for classification in data streams

Data stream mining is the process of extracting knowledge structures from continuous, rapid records of data. Many decision tasks can be formulated as stream mining problems, and consequently many new algorithms for data streams are being proposed. Decision rules are among the most interpretable and flexible models for predictive data mining. Nevertheless, few algorithms have been proposed in the literature to learn rule models from time-changing, high-speed flows of data. In this paper we present the very fast decision rules (VFDR) algorithm and discuss extensions to the base version. All the proposed versions are one-pass, any-time algorithms; they work online and learn ordered or unordered rule sets. Algorithms designed for data streams should also be able to detect changes and quickly adapt the decision model. To handle these situations we present the adaptive extension (AVFDR), which detects changes in the process generating the data and adapts the decision model accordingly. Detecting drifts locally takes advantage of the modularity of rule sets: in AVFDR, each individual rule monitors the evolution of its performance metrics to detect concept drift, and a rule is pruned whenever a drift is signaled. This explicit change-detection mechanism provides useful information about the dynamics of the process generating the data, enables faster adaptation to changes, and yields more compact rule sets. The experimental evaluation shows that the proposed algorithms achieve competitive results in comparison to alternative methods and that the adaptive variants learn fast, compact rule sets from evolving streams.
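To make the per-rule change-detection idea concrete, the sketch below shows how a DDM-style error-rate monitor (in the spirit of the drift detection method of Gama et al.) could be attached to each rule and used to prune rules whose error degrades. This is a minimal illustrative sketch, not the authors' implementation: the class name RuleDriftMonitor, the covers/predict/learn_from methods on rule objects, and the threshold values are assumptions made for the example.

```python
import math

class RuleDriftMonitor:
    """DDM-style error-rate monitor attached to a single decision rule.

    Tracks the rule's running error rate p and its standard deviation
    s = sqrt(p * (1 - p) / n); drift is signaled when p + s rises well
    above the best (minimum) p + s observed so far.
    """

    def __init__(self, drift_threshold=3.0, min_examples=30):
        self.n = 0                    # examples covered by the rule so far
        self.errors = 0               # misclassifications among them
        self.p_min = float("inf")     # error rate at the best point seen
        self.s_min = float("inf")     # its standard deviation
        self.drift_threshold = drift_threshold
        self.min_examples = min_examples

    def update(self, correct):
        """Record one covered example; return True if drift is signaled."""
        self.n += 1
        if not correct:
            self.errors += 1
        p = self.errors / self.n
        s = math.sqrt(p * (1 - p) / self.n)
        if self.n < self.min_examples:
            return False
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        # Drift: current error level is significantly worse than the best seen.
        return p + s > self.p_min + self.drift_threshold * self.s_min


def process_example(rules, x, y):
    """Route an example to the rules that cover it; prune a rule when its
    monitor signals drift (hypothetical rule interface: covers, predict,
    learn_from, and a .monitor attribute)."""
    for rule in list(rules):
        if rule.covers(x):
            correct = rule.predict(x) == y
            if rule.monitor.update(correct):
                rules.remove(rule)    # local adaptation: drop the outdated rule
            else:
                rule.learn_from(x, y)
```

Because each rule carries its own monitor, only the rules affected by a local change are discarded, while the rest of the rule set continues to be used and refined.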
