A kernel-based ensemble classifier for evolving stream of trees with double concept drifting reaction

Modern mining approaches must cope with the growing availability of structured data. Here we focus on processing streams of trees, and specifically on classification tasks. We show that adopting a double concept drift reaction mechanism within a kernel-based ensemble of classifiers yields an effective and efficient system for processing streams of trees. The original contribution is a local concept drift mechanism, designed specifically for structured data, that computes the ensemble score function using only the reliable (sub)trees of the classification models that constitute the ensemble. Experimental results seem to support the relevance and usefulness of this local component for concept drift management.
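To make the idea of a score restricted to reliable (sub)trees concrete, here is a minimal Python sketch. It is not the authors' implementation. It rests on several assumptions that the abstract does not state: trees are nested tuples of the form `(label, child, ...)`; each model is a plain dictionary holding per-subtree `weights`, a per-subtree `reliability` value, and a model-level `vote`; and a fixed threshold `min_reliability` decides which subtrees count. All of these names are hypothetical.

```python
from collections import Counter


def subtrees(tree):
    """Yield every complete subtree of a tree written as (label, child, ...)."""
    yield tree
    for child in tree[1:]:
        yield from subtrees(child)


def features(tree):
    """Count how often each complete subtree occurs in the tree."""
    return Counter(subtrees(tree))


def model_score(model, tree, min_reliability):
    """Score one model, skipping subtrees below the reliability threshold.

    Assumption: this filter stands in for the local drift reaction.
    """
    feats = features(tree)
    return sum(
        weight * feats[sub]
        for sub, weight in model["weights"].items()
        if model["reliability"].get(sub, 0.0) >= min_reliability
    )


def ensemble_score(models, tree, min_reliability=0.5):
    """Combine the models, each scaled by its (hypothetical) model-level vote.

    Assumption: the vote stands in for a model-level drift reaction.
    """
    return sum(
        model["vote"] * model_score(model, tree, min_reliability)
        for model in models
    )
```

In this sketch a subtree whose reliability has dropped simply stops contributing to the score, while the rest of the model keeps voting. The sign of `ensemble_score` could then serve as the predicted class in a binary task.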
