An analysis on the relationship between uncertainty and misclassification rate of classifiers

Abstract This paper offers new insight into the relationship between the uncertainty and the misclassification rate of a classifier. We formulate the relationship explicitly by taking entropy as the measure of uncertainty and by analyzing the misclassification rate in terms of the membership degree difference. Focusing on binary classification problems, this study validates, both theoretically and experimentally, that the misclassification rate increases with uncertainty if two conditions are satisfied: (1) the distributions of the two classes over the membership degree difference are unimodal, and (2) these two distributions attain their peaks where the membership degree difference is less than zero and greater than zero, respectively. By expressing this relationship explicitly, the work aims to provide practical guidelines for improving classifier performance.
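To make the quantities named in the abstract concrete, the following is a minimal sketch rather than the paper's own formulation. It assumes that a binary classifier outputs a membership degree mu in [0, 1] for class 1 and 1 - mu for class 2, that the membership degree difference is mu - (1 - mu), and that the predicted class is given by the sign of that difference. The function names and the sample values are hypothetical.

```python
import math


def binary_entropy(mu):
    """Shannon entropy (in bits) of the membership vector (mu, 1 - mu).

    Assumption: entropy of the two membership degrees is used as the
    uncertainty of a single prediction.
    """
    if mu <= 0.0 or mu >= 1.0:
        return 0.0  # a crisp output carries no uncertainty
    return -(mu * math.log2(mu) + (1.0 - mu) * math.log2(1.0 - mu))


def membership_difference(mu):
    """Membership degree of class 1 minus that of class 2 (assumed form)."""
    return mu - (1.0 - mu)


def predict(mu):
    """Assign class 1 when the difference is positive, otherwise class 2."""
    return 1 if membership_difference(mu) > 0 else 2


def misclassification_rate(memberships, labels):
    """Fraction of samples whose predicted class differs from the label."""
    errors = sum(1 for mu, y in zip(memberships, labels) if predict(mu) != y)
    return errors / len(labels)


if __name__ == "__main__":
    # Hypothetical classifier outputs and true labels, for illustration only.
    memberships = [0.95, 0.80, 0.55, 0.45, 0.30, 0.05]
    labels = [1, 1, 2, 1, 2, 2]
    for mu, y in zip(memberships, labels):
        print(mu, y, predict(mu), round(binary_entropy(mu), 3))
    print("misclassification rate:", misclassification_rate(memberships, labels))
```

In this toy data the two errors fall on the samples whose membership degree difference is closest to zero, which are also the samples with the highest entropy. This matches the behaviour the abstract describes, under the assumptions stated above.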
