Shared domains of competence of approximate learning models using measures of separability of classes

In this work we jointly analyze the performance of three classic Artificial Neural Network models and one Support Vector Machine with respect to a series of data complexity measures known as measures of separability of classes. In particular, we consider a Radial Basis Function Network, a Multi-Layer Perceptron and a Learning Vector Quantization network, while the Sequential Minimal Optimization method is used to train the Support Vector Machine. We compute five measures of separability of classes, which have proved to be very discriminative when analyzing classifier performance, over a wide range of data sets built from real data. We find that two of these measures, due to their related nature, allow us to extract common behavior patterns for the four learning methods. Using these two metrics we obtain rules that describe both good and bad performance of the Artificial Neural Networks and the Support Vector Machine. With these rules we characterize the performance of the methods in terms of the data complexity metrics, and their common domains of competence are thereby established. These shared domains of good and bad behavior make it possible to determine in advance whether the approximate models can be expected to perform well or poorly on a given problem, or whether a more complex model configuration is needed.
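
The measures of separability of classes mentioned above come from the data complexity literature, with Fisher's maximum discriminant ratio (commonly denoted F1) being a representative example. The Python sketch below shows how such a measure can be computed for a two-class data set and turned into a simple threshold rule of the kind described in the abstract. The function names and the threshold value are illustrative assumptions for this sketch, not the actual rules or cut-off values derived in the paper.

import numpy as np

def fisher_discriminant_ratio(X, y):
    """Maximum Fisher's discriminant ratio (F1) for a two-class problem.

    For each feature, compute (mu1 - mu2)^2 / (sigma1^2 + sigma2^2), where
    mu_i and sigma_i^2 are the per-class mean and variance of that feature,
    and return the maximum over all features.  Larger values indicate classes
    that are easier to separate using at least one single feature.
    """
    classes = np.unique(y)
    if len(classes) != 2:
        raise ValueError("This sketch handles two-class problems only.")
    X1, X2 = X[y == classes[0]], X[y == classes[1]]
    numerator = (X1.mean(axis=0) - X2.mean(axis=0)) ** 2
    denominator = X1.var(axis=0) + X2.var(axis=0)
    # Guard against constant features (zero variance in both classes).
    ratios = np.divide(numerator, denominator,
                       out=np.zeros_like(numerator), where=denominator > 0)
    return ratios.max()

def predicted_domain(f1_value, threshold=1.0):
    # Threshold rule of the kind described in the abstract; the cut-off
    # value here is a placeholder, not the interval reported in the paper.
    return "good behavior expected" if f1_value >= threshold else "undetermined"

In practice, a rule of this form would be evaluated on each data set before training, so that the expected quality of the approximate models (or the need for a more complex configuration) can be anticipated from the complexity metric alone.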
