Multilayer Ensemble Pruning via Novel Multi-sub-swarm Particle Swarm Optimization

Classifier ensemble methods have recently attracted increasing attention in the machine-learning and data-mining communities, since an ensemble usually performs better than a single classifier. Many methods for creating diverse classifiers were developed over the past decade. Once these diverse classifiers have been generated, the proper base classifiers must be selected to join the ensemble; this selection process is usually called ensemble pruning. In general, ensemble pruning is a selection process in which an optimal combination is chosen from many existing base classifiers, so base classifiers containing useful information may be excluded. To avoid this problem, a multilayer ensemble pruning model is used in this paper, in which the pruning of each layer can be viewed as a multimodal optimization problem. A novel multi-sub-swarm particle swarm optimization (MSSPSO) algorithm is used to find multiple solutions for this multilayer pruning model: each base classifier generates an oracle output, and each layer applies MSSPSO to produce a different pruning based on the previous layer's oracle outputs. A series of experiments on UCI datasets shows that multilayer ensemble pruning via MSSPSO can improve the generalization performance of the multi-classifier ensemble system; the experimental results also reveal a relationship between diversity and the pruning technique.
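To make the single-layer pruning step concrete, the sketch below casts it as the discrete binary particle swarm optimization of Kennedy and Eberhart: each particle is a bit mask over the base classifiers, and fitness is the majority-vote accuracy of the selected subset on the oracle outputs. This is a minimal illustration, not the paper's MSSPSO (it uses one swarm rather than multiple sub-swarms, and all function and parameter names are assumptions for the example).

```python
import math
import random

def binary_pso_prune(oracle, labels, n_particles=12, iters=40, seed=0):
    """Illustrative single-swarm binary-PSO pruning (not the paper's MSSPSO).

    oracle[i][j] is base classifier i's prediction on sample j;
    a particle is a bit mask selecting which classifiers join the ensemble."""
    rng = random.Random(seed)
    n_clf = len(oracle)

    def fitness(mask):
        # Majority-vote accuracy of the selected subset.
        if not any(mask):
            return 0.0
        correct = 0
        for j, y in enumerate(labels):
            votes = {}
            for i, on in enumerate(mask):
                if on:
                    v = oracle[i][j]
                    votes[v] = votes.get(v, 0) + 1
            if max(votes, key=votes.get) == y:
                correct += 1
        return correct / len(labels)

    # Positions are bit vectors; velocities are real-valued.
    pos = [[rng.random() < 0.5 for _ in range(n_clf)] for _ in range(n_particles)]
    vel = [[0.0] * n_clf for _ in range(n_particles)]
    pbest = [list(p) for p in pos]
    gbest = max(pbest, key=fitness)

    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed values)
    for _ in range(iters):
        for k in range(n_particles):
            for i in range(n_clf):
                vel[k][i] = (w * vel[k][i]
                             + c1 * rng.random() * (pbest[k][i] - pos[k][i])
                             + c2 * rng.random() * (gbest[i] - pos[k][i]))
                # Sigmoid of the velocity gives the probability that the bit is 1.
                pos[k][i] = rng.random() < 1.0 / (1.0 + math.exp(-vel[k][i]))
            if fitness(pos[k]) > fitness(pbest[k]):
                pbest[k] = list(pos[k])
                if fitness(pbest[k]) > fitness(gbest):
                    gbest = list(pbest[k])
    return gbest, fitness(gbest)
```

In the multilayer model, each layer would rerun such a search on the oracle outputs surviving the previous layer, with the multiple sub-swarms preserving several distinct prunings instead of a single global best.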
