Pairwise combination of classifiers for ensemble learning on data streams

This work presents two different voting strategies for ensemble learning on data streams based on the pairwise combination of component classifiers. Despite efforts to build a diverse ensemble, there is always some degree of overlap between the models of the component classifiers. Our voting strategies aim to exploit these overlaps to support the ensemble prediction. We hypothesize that by combining pairs of classifiers it is possible to alleviate incorrect individual predictions that would otherwise negatively impact the overall ensemble decision. The first strategy, Pairwise Accuracy (PA), combines the shared accuracy estimates of all possible pairs in the ensemble, while the second strategy, Pairwise Patterns (PP), records patterns of pairwise decisions during training and uses these patterns during prediction. We present empirical results comparing ensemble classifiers equipped with their original voting methods against the same ensembles using our proposed methods, on both real and synthetic datasets, with and without concept drifts. Our analysis indicates that pairwise voting with PP is able to enhance overall performance, especially on real datasets, and that PA is useful whenever there are noticeable differences in accuracy estimates among ensemble members, which is common during concept drifts.
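
To make the PA idea concrete, the sketch below shows one possible way a pairwise-accuracy vote could be organized. It is a minimal illustration under our own assumptions (ensemble members exposing a predict method, shared accuracy estimated as the fraction of instances where a pair agreed with the true label), not the paper's actual implementation; all names are hypothetical.

```python
# Hypothetical sketch of a Pairwise Accuracy (PA) style vote.
# Assumption: each ensemble member exposes predict(x); the "shared accuracy"
# of a pair is estimated here as the fraction of seen labelled instances on
# which both members agreed with each other and with the true label.
from collections import defaultdict
from itertools import combinations


class PairwiseAccuracyVoter:
    def __init__(self, members):
        self.members = list(members)
        # shared_hits[(i, j)]: instances where members i and j both predicted
        # the true label; seen: total labelled instances observed so far.
        self.shared_hits = defaultdict(int)
        self.seen = 0

    def update(self, x, y):
        """Update pairwise shared-accuracy estimates with a labelled instance."""
        preds = [m.predict(x) for m in self.members]
        for i, j in combinations(range(len(self.members)), 2):
            if preds[i] == preds[j] == y:
                self.shared_hits[(i, j)] += 1
        self.seen += 1

    def predict(self, x):
        """Each agreeing pair votes for its common class, weighted by the
        pair's estimated shared accuracy; if no pair agrees, fall back to a
        plain plurality vote over the individual predictions."""
        preds = [m.predict(x) for m in self.members]
        scores = defaultdict(float)
        for i, j in combinations(range(len(self.members)), 2):
            if preds[i] == preds[j]:
                scores[preds[i]] += self.shared_hits[(i, j)] / max(self.seen, 1)
        if not scores:
            for p in preds:
                scores[p] += 1.0
        return max(scores, key=scores.get)
```

A PP-style variant could instead be sketched by replacing the shared-accuracy counters with a lookup table keyed by each pair's joint decision pattern observed during training, and voting at prediction time for the class most often associated with the observed pattern.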
