Pruning the Ensemble of ANN Based on Decision Tree Induction

Ensemble learning is a powerful approach for achieving more accurate predictions than a single classifier. However, this classification power comes at the expense of heavy storage requirements and computational burden. Ensemble pruning is therefore a crucial step for reducing the predictive overhead without degrading the performance of the original ensemble. This paper proposes an efficient and effective ordering-based ensemble pruning method built on decision tree induction. The method maps the dataset and the base classifiers to a new dataset, transforming ensemble pruning into a feature selection problem, so that a set of accurate, diverse, and complementary base classifiers can be selected by inducing a decision tree. In addition, an evaluation function is designed that deliberately favors candidate sub-ensembles with improved performance on low-margin instances. Comparative experiments on 24 benchmark datasets demonstrate the effectiveness of the proposed method.
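
The abstract does not spell out the evaluation function, but the ensemble margin it refers to is commonly defined as

$$m(x) = \frac{v_y(x) - \max_{j \neq y} v_j(x)}{T},$$

where $v_j(x)$ is the number of base classifiers voting for class $j$, $y$ is the true class of instance $x$, and $T$ is the ensemble size; low-margin instances are those the ensemble classifies with the least consensus.

The mapping from base classifiers to features can be sketched in a few lines. Below is a minimal illustration assuming scikit-learn; the pool size, tree depth, and the rule "keep every classifier the induced tree splits on" are my assumptions, and the paper's margin-aware evaluation function is omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy data split into train / validation / test.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_te, y_val, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# 1. Train a pool of base ANNs on bootstrap samples (bagging).
rng = np.random.RandomState(0)
pool = []
for _ in range(15):
    idx = rng.randint(0, len(X_tr), len(X_tr))  # bootstrap sample
    net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500,
                        random_state=rng.randint(1 << 30))
    pool.append(net.fit(X_tr[idx], y_tr[idx]))

# 2. Map the validation set and base classifiers to a new dataset:
#    one column per classifier, holding its predicted label per instance.
meta_X = np.column_stack([clf.predict(X_val) for clf in pool])

# 3. Induce a decision tree on the meta-dataset; pruning becomes
#    feature selection -- keep the classifiers the tree splits on.
tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(meta_X, y_val)
selected = np.flatnonzero(tree.feature_importances_ > 0)
print("kept classifiers:", selected)

# 4. Majority vote over the pruned sub-ensemble on the test set.
votes = np.column_stack([pool[i].predict(X_te) for i in selected])
pred = np.apply_along_axis(lambda r: np.bincount(r).argmax(), 1, votes)
print("pruned-ensemble accuracy:", (pred == y_te).mean())
```

Selecting classifiers via `feature_importances_ > 0` is just one simple reading of "features used by the induced tree"; the paper's actual ordering-based procedure may rank or threshold the candidates differently.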
