Optimized Classification Predictions with a New Index Combining Machine Learning Algorithms

Voting is a commonly used ensemble method that aims to optimize classification predictions by combining the results of individual base classifiers. However, how to select appropriate classifiers to participate in a voting algorithm remains an open issue. In this study we developed a novel Dissimilarity-Performance (DP) index that incorporates two important criteria for selecting the base classifiers that participate in voting: their differential response in classification (dissimilarity) when combined in triads, and their individual performance. To develop this empirical index, we first used a range of different datasets to evaluate the relationship between voting results and measures of dissimilarity among classifiers of different types (rules, trees, lazy classifiers, functions and Bayes). Second, we computed how classifiers with different individual performance and/or diverse results jointly affect voting performance. Our DP index was able to rank the classifier combinations according to their voting performance and thus to suggest the optimal combination. The proposed index is recommended to individual machine learning users as a preliminary tool for identifying which classifiers to combine in order to achieve more accurate classification predictions while avoiding a computationally intensive and time-consuming search.
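The two criteria named above, dissimilarity within a triad and individual performance, can be illustrated with a minimal Python sketch. The abstract does not give the DP index formula, so the `score` below is a hypothetical stand-in (a weighted sum of mean accuracy and mean pairwise disagreement) and should not be read as the authors' index. The function names, the `weight` parameter and the choice of disagreement rate as the dissimilarity measure are likewise assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def accuracy(pred, truth):
    """Fraction of instances where the prediction matches the true label."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def disagreement(a, b):
    """Fraction of instances on which two classifiers give different labels."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def majority_vote(triad_preds):
    """Per-instance majority label of three classifiers.

    If all three disagree (possible with more than two classes), fall back
    to the first classifier's label; this tie rule is an assumption.
    """
    voted = []
    for votes in zip(*triad_preds):
        label, count = Counter(votes).most_common(1)[0]
        voted.append(label if count >= 2 else votes[0])
    return voted


def rank_triads(preds, truth, weight=0.5):
    """Rank every triad of classifiers by a HYPOTHETICAL combined score.

    preds maps a classifier name to its list of predicted labels.
    Returns (names, score, voting_accuracy) tuples, best score first.
    """
    rows = []
    for names in combinations(sorted(preds), 3):
        triad = [preds[n] for n in names]
        mean_acc = sum(accuracy(p, truth) for p in triad) / 3
        mean_dis = sum(disagreement(a, b) for a, b in combinations(triad, 2)) / 3
        # Placeholder combination, NOT the DP index from the paper.
        score = weight * mean_acc + (1 - weight) * mean_dis
        vote_acc = accuracy(majority_vote(triad), truth)
        rows.append((names, score, vote_acc))
    return sorted(rows, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    # Toy labels and predictions, invented purely for the demonstration.
    truth = [1, 0, 1, 1, 0, 1, 0, 0]
    preds = {
        "rules": [1, 0, 1, 0, 0, 1, 0, 1],
        "tree":  [1, 1, 1, 1, 0, 0, 0, 0],
        "lazy":  [0, 0, 1, 1, 0, 1, 1, 0],
        "bayes": [1, 0, 0, 1, 1, 1, 0, 0],
    }
    for names, score, vote_acc in rank_triads(preds, truth):
        print(names, round(score, 3), round(vote_acc, 3))
```

Comparing the `score` column with the observed voting accuracy on held-out data is one way to check whether such a ranking is informative before relying on it; an exhaustive search over all triads, as done here, is only practical for a small pool of classifiers.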
