A novel ensemble pruning algorithm based on randomized greedy selective strategy and ballot

Although the Directed Hill Climbing Ensemble Pruning (DHCEP) algorithm achieves favorable classification performance, it often yields suboptimal solutions to the ensemble pruning problem because it explores only a limited portion of the whole solution space. This observation motivates the development of a novel Ensemble Pruning algorithm based on a Randomized Greedy Selective Strategy and Ballot (RGSS&B-EP), whose two major contributions are these: a randomization technique is introduced into the greedy ensemble pruning procedure, and the final pruned ensemble is generated by ballot. Experimental results, including t-tests on three benchmark classification tasks, verify the validity of the proposed RGSS&B-EP algorithm.
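The abstract only names the two ingredients (randomized greedy selection and a ballot) without specifying their exact form, so the following is a minimal, hedged sketch of one plausible reading: a GRASP-style randomized greedy forward selection that, at each step, draws uniformly from a restricted candidate list of the top-ranked classifiers rather than always taking the single best, repeated several times, with the final ensemble formed by a majority ballot over the repeated runs. The function names (`randomized_greedy_run`, `prune_by_ballot`), the `rcl_size` parameter, the stopping rule, and the majority-ballot rule are all illustrative assumptions, not the paper's actual design.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy setup: train a bagged pool of decision trees, prune on a validation split.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

def train_pool(n_models=20):
    """Bagging: fit each tree on a bootstrap sample of the training set."""
    pool = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X_tr), len(X_tr))
        pool.append(DecisionTreeClassifier(max_depth=3).fit(X_tr[idx], y_tr[idx]))
    return pool

def vote(models, X):
    """Majority vote of the given models (binary 0/1 labels assumed)."""
    preds = np.stack([m.predict(X) for m in models])
    return (preds.mean(axis=0) >= 0.5).astype(int)

def randomized_greedy_run(pool, rcl_size=3):
    """One randomized greedy forward-selection pass (GRASP-style assumption):
    rank candidates by validation accuracy of the grown subensemble, then draw
    uniformly from the top `rcl_size` (the restricted candidate list) instead
    of always adding the single best candidate."""
    selected, remaining, best_acc = [], list(range(len(pool))), 0.0
    while remaining:
        gains = [(np.mean(vote([pool[j] for j in selected + [i]], X_va) == y_va), i)
                 for i in remaining]
        gains.sort(reverse=True)
        acc, pick = gains[rng.integers(0, min(rcl_size, len(gains)))]
        if acc < best_acc:          # assumed stopping rule: no candidate improves
            break
        best_acc = max(best_acc, acc)
        selected.append(pick)
        remaining.remove(pick)
    return set(selected)

def prune_by_ballot(pool, n_runs=11):
    """Repeat the randomized greedy selection and keep the models that win
    the ballot, here assumed to mean appearing in a majority of the runs."""
    counts = np.zeros(len(pool))
    for _ in range(n_runs):
        for i in randomized_greedy_run(pool):
            counts[i] += 1
    winners = np.where(counts > n_runs / 2)[0]
    if winners.size == 0:           # fallback if no model reaches a majority
        winners = np.where(counts == counts.max())[0]
    return [pool[i] for i in winners]

pool = train_pool()
pruned = prune_by_ballot(pool)
print(f"kept {len(pruned)}/{len(pool)} models, "
      f"val acc {np.mean(vote(pruned, X_va) == y_va):.3f}")
```

The repeated randomized runs are what distinguish this from plain directed hill climbing: each run can reach a different local optimum, and the ballot then aggregates them rather than trusting any single greedy trajectory.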
