A new hybrid classifier selection model based on mRMR method and diversity measures

Classifier subset selection is an important stage in the design of multiple classifier systems (MCSs): it reduces the number of classifiers by eliminating identical and inaccurate members. Minimum redundancy maximum relevance (mRMR) is a feature selection method that trades off relevance against redundancy, discarding redundant features and keeping the most pertinent ones. In the current work, a novel classifier subset selection method based on the mRMR method and diversity measures is proposed for building an efficient classifier ensemble. The proposed selection model uses a greedy search algorithm with a diversity-accuracy criterion to determine the optimal classifier set. The disagreement and Q-statistic measures estimate the diversity among the members, while relevance captures the accuracy of the ensemble and its members. The method is evaluated on 24 real datasets from the UCI repository and the Kuncheva collection. The results establish the efficiency of the proposed selection method, which shows superior performance compared to popular ensembles and several existing selection methods.
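The abstract does not spell out the exact selection criterion, so the following is a minimal Python sketch under stated assumptions: the pairwise Q-statistic and disagreement measures follow their standard definitions over joint correct/incorrect counts (as in Kuncheva's work on diversity), and the greedy rule mimics mRMR by adding, at each step, the classifier that maximizes validation accuracy (relevance) minus the mean Q-statistic with the already selected members (redundancy). The names `greedy_select` and `pair_counts` and the trade-off weight `alpha` are illustrative, not taken from the paper; `disagreement` is included as a drop-in alternative redundancy term.

```python
import numpy as np

def pair_counts(pred_a, pred_b, y):
    """Joint decision counts for two classifiers on labels y.

    N11: both correct, N00: both wrong, N10/N01: exactly one correct.
    """
    a_ok = pred_a == y
    b_ok = pred_b == y
    n11 = np.sum(a_ok & b_ok)
    n00 = np.sum(~a_ok & ~b_ok)
    n10 = np.sum(a_ok & ~b_ok)
    n01 = np.sum(~a_ok & b_ok)
    return n11, n00, n10, n01

def q_statistic(pred_a, pred_b, y):
    """Yule's Q: near -1 for diverse pairs, near +1 for similar ones."""
    n11, n00, n10, n01 = pair_counts(pred_a, pred_b, y)
    denom = n11 * n00 + n01 * n10
    return (n11 * n00 - n01 * n10) / denom if denom else 0.0

def disagreement(pred_a, pred_b, y):
    """Fraction of samples on which exactly one classifier is correct."""
    n11, n00, n10, n01 = pair_counts(pred_a, pred_b, y)
    return (n10 + n01) / (n11 + n00 + n10 + n01)

def greedy_select(preds, y, k, alpha=0.5):
    """Greedy mRMR-style selection over a classifier pool.

    preds: dict mapping classifier name -> np.ndarray of predictions
    on a validation set; y: np.ndarray of true labels. At each step,
    add the classifier maximizing relevance (accuracy) minus
    alpha * mean Q-statistic with the already selected members.
    """
    pool = set(preds)
    # Seed with the single most accurate classifier (max relevance),
    # mirroring mRMR's start from the most relevant feature.
    selected = [max(pool, key=lambda c: np.mean(preds[c] == y))]
    pool.remove(selected[0])
    while len(selected) < k and pool:
        def score(c):
            relevance = np.mean(preds[c] == y)
            redundancy = np.mean([q_statistic(preds[c], preds[s], y)
                                  for s in selected])
            return relevance - alpha * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected
```

In this sketch, `alpha` controls how strongly redundancy is penalized relative to accuracy; substituting `disagreement` (negated, since higher disagreement means more diversity) for `q_statistic` in the score yields the disagreement-based variant.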
