Cluster-based ensemble of classifiers

This paper presents the cluster-based ensemble classifier, an approach to generating an ensemble of classifiers from multiple clusters within the classified data. Clustering partitions the data set into multiple clusters of highly correlated data that are otherwise difficult to separate, and different base classifiers learn the class boundaries within the clusters. Because each base classifier works on a different difficult-to-classify subset of the data, its learning is more focussed and accurate. The final verdict on patterns of unknown class is reached by selection rather than fusion. The impact of clustering on the learning parameters and accuracy of several learning algorithms, including neural networks, support vector machines, decision trees and the k-NN classifier, is investigated. The cluster-based ensemble classifier was evaluated on a number of benchmark data sets from the UCI machine learning repository, and the experimental results demonstrate its superiority over bagging and boosting.
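
The cluster-then-select idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name a clustering algorithm, so plain k-means is assumed here, a 1-nearest-neighbour rule stands in for the base classifiers, and every name (`kmeans`, `OneNN`, `ClusterEnsemble`, `make_base`) is hypothetical. Selection is modelled by routing each unknown pattern to the single base classifier trained on its nearest cluster, so no outputs are fused.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def nearest(centroids, p):
    """Index of the centroid closest to point p (first index wins ties)."""
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i], p))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (an assumption; the abstract names no clustering method)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # Recompute each centroid as the mean of its group; keep it if empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


class OneNN:
    """Stand-in base classifier: 1-nearest neighbour over its training subset."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, x):
        j = min(range(len(self.X)), key=lambda i: sq_dist(self.X[i], x))
        return self.y[j]


class ClusterEnsemble:
    """One base classifier per cluster; prediction selects exactly one of them."""

    def __init__(self, k, make_base=OneNN, seed=0):
        self.k, self.make_base, self.seed = k, make_base, seed

    def fit(self, X, y):
        self.centroids = kmeans(X, self.k, seed=self.seed)
        buckets = [([], []) for _ in range(self.k)]
        for x, label in zip(X, y):
            bx, by = buckets[nearest(self.centroids, x)]
            bx.append(x)
            by.append(label)
        # A cluster that ends up empty falls back to a classifier on all data.
        fallback = self.make_base().fit(X, y)
        self.experts = [
            self.make_base().fit(bx, by) if bx else fallback
            for bx, by in buckets
        ]
        return self

    def predict(self, x):
        # Selection, not fusion: only the nearest cluster's classifier votes.
        return self.experts[nearest(self.centroids, x)].predict(x)


# Toy data: two well-separated groups, each containing both classes.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
y = ["a", "b", "a", "b", "b", "a", "b", "a"]
model = ClusterEnsemble(k=2).fit(X, y)
predictions = [model.predict(x) for x in X]
```

Any object with `fit` and `predict` methods could be passed as `make_base`, which is one way to try the different learners the abstract lists. The toy data only exercises the routing logic and says nothing about the accuracy reported in the paper.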
